Re: [R] custom graphing of box and whisker plots
On Jun 28, 2012 at 9:08pm Brianna Wright wrote: I'm trying to graph some data in a boxplot-like style, but I want to set the box and whisker limits myself...

See ?bxp and ?boxplot for the format of the data you need to pass to bxp(). (You just have to add the median to what you have provided in your example.)

Regards, Mark.

- Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa

--
View this message in context: http://r.789695.n4.nabble.com/custom-graphing-of-box-and-whisker-plots-tp4634826p4634845.html
Sent from the R help mailing list archive at Nabble.com.

R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fwd: How to best analyze dataset with zero-inflated loglinear dependent variable?
On Jun 08, 2012 at 5:37pm Heidi Bertels wrote [fwd: Iuri Gavronski]: The dependent variable is problematic because of its distribution. It is zero-inflated (many projects did not receive funding), and for those projects that did receive funding, the distribution is loglinear. I believe this is called a zero-inflated loglinear continuous dependent variable.

Look at package gamlss, where you might find something: it has a number of zero-inflated and zero-adjusted distributions. Package VGAM might also fit this.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Fwd-How-to-best-analyze-dataset-with-zero-inflated-loglinear-dependent-variable-tp4632829p4632884.html
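As a rough sketch of the kind of fit gamlss supports for this (the data below are simulated purely for illustration; ZAGA is gamlss's zero-adjusted gamma family, which combines a point mass at zero with a positively skewed continuous part):

```r
## Sketch: zero-adjusted fit with gamlss on simulated funding data
library(gamlss)

set.seed(1)
n <- 200
funded <- rbinom(n, 1, 0.6)                           # many projects get nothing
amount <- funded * rgamma(n, shape = 2, rate = 0.5)   # skewed amounts for the rest
x <- rnorm(n)                                         # an illustrative predictor

## nu models P(y == 0); mu and sigma model the continuous part
fit <- gamlss(amount ~ x, nu.formula = ~ x, family = ZAGA)
summary(fit)
```

Whether ZAGA (or one of the zero-adjusted inverse Gaussian/log-normal families) is appropriate depends on the shape of the non-zero part of your data.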
Re: [R] meaning of sigma from LM, is it the same as RMSE
On Apr 05, 2012 at 1:47am John Sorkin wrote: Is the sigma from a lm...the RMSE (root mean square error)?

John, RMSE is usually calculated using the number of observations/cases, whereas summary.lm()$sigma is calculated using the residual degrees of freedom. See below:

## Helps to study the output of anova()
set.seed(231)
x <- rnorm(20, 2, .5)
y <- rnorm(20, 2, .7)
T.lm <- lm(y ~ x)
summary(T.lm)$sigma
[1] 0.7403162

anova(T.lm)
Analysis of Variance Table
Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
x          1 0.0036 0.00360  0.0066 0.9363
Residuals 18 9.8652 0.54807

sum(resid(T.lm)^2)
[1] 9.865225
sqrt(sum(resid(T.lm)^2)/18)   ## sigma (residual df = 18)
[1] 0.7403162
sqrt(sum(resid(T.lm)^2)/20)   ## RMSE (n = 20)
[1] 0.7023256
## OR
sqrt(mean((y - fitted(T.lm))^2))
[1] 0.7023256

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/meaning-of-sigma-from-LM-is-it-the-same-as-RMSE-tp4533515p4534165.html
Re: [R] Principal Components for matrices with NA
On Feb 27, 2012 at 9:30pm Joyous Fisher wrote: Q: is there a way to do princomp or another method where every row has at least one missing column?

You have several options. Try the function nipals in packages ade4 and plspm. Also look at package pcaMethods (on Bioconductor), where you will find a full range of options for carrying out principal component analysis on matrices with missing values.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Principal-Components-for-matrices-with-NA-tp4425930p4427216.html
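A minimal sketch of the ade4 route (the component names `$li` and `$c1` are from my reading of ?nipals; check the help page before relying on them). NIPALS estimates the components iteratively and simply skips over missing cells:

```r
## Sketch: PCA via NIPALS on data with scattered NAs (ade4)
library(ade4)

set.seed(7)
X <- matrix(rnorm(100), 20, 5)
X[sample(length(X), 10)] <- NA            # punch some holes in the data

fit <- nipals(as.data.frame(X), nf = 2)   # first two axes
head(fit$li)   # row scores on the axes
fit$c1         # variable loadings
```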
Re: [R] error in fitdistr
On Feb 23, 2012 at 11:19am Soheila wrote: Who can help me to solve this problem? est.chi[i,] <- c(fitdistr(as.numeric(data2[,i]), "chi-squared", start=list(df=1))$estimate) Warning message: In optim(x = c(7.86755, 7.50852, 7.86342, 7.70589, 7.70153, 7.58272, : one-diml optimization by Nelder-Mead is unreliable: use "Brent" or optimize() directly

The warning message tells you to use "Brent" rather than the default Nelder-Mead. So do that.

## ?optim
est.chi[i,] <- c(fitdistr(as.numeric(data2[,i]), densfun="chi-squared",
    start=list(df=1), method="Brent")$estimate)

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/error-in-fitdistr-tp4413293p4413459.html
Re: [R] Why does the order of terms in a formula translate into different models/ model matrices?
On Jan 27, 2012 at 6:29pm Ben Bolker wrote: My best (not very well-informed) guess is that there is something going on with automatic dropping of terms that appear to be aliased?? and that this test is (perhaps unintentionally) order-dependent.

Looks to me like Ben is close to the mark here. From ?alias: "Complete aliasing refers to effects in linear models that cannot be estimated independently of the terms which occur earlier in the model and so have their coefficients omitted from the fit."

alias(m0, complete=TRUE)
Model :
Y ~ A:B + x:A
Complete :
        (Intercept) Aa1:Bb1 Aa2:Bb1 Aa1:Bb2 Aa2:Bb2 Aa1:Bb3 Aa2:Bb3 Aa1:Bb4 Aa1:x Aa2:x
Aa2:Bb4  1          -1      -1      -1      -1      -1      -1      -1       0     0

alias(m1, complete=TRUE)
Model :
Y ~ x:A + A:B

However, if you fit proper (i.e. statistically sensible) models, then there is no problem reversing the terms:

logLik(m2 <- lm(Y ~ A*B + x*A, dat))
'log Lik.' -13.22186 (df=11)
logLik(m3 <- lm(Y ~ x*A + A*B, dat))
'log Lik.' -13.22186 (df=11)

Regards, Mark Difford

View this message in context: http://r.789695.n4.nabble.com/Why-does-the-order-of-terms-in-a-formula-translate-into-different-models-model-matrices-tp4333408p4334385.html
Re: [R] a question about taylor.diagram in plotrix package
On Jan 20, 2012 at 4:08pm baydap wrote: In the embedded figure you can see the label "correlation" is too close to the ticks. How can I move it and make it larger?

I have hacked the function to do this and to return a table of relevant statistics. You are welcome to it, but I don't have access to it until tomorrow or the day after. I will send it to you off-list.

Note: It would be nice to have a real name and affiliation.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/a-question-about-taylor-diagram-in-plotrix-package-tp4313328p4315623.html
Re: [R] 4th corner analysis ade4 - what do the colors mean
On Jan 21, 2012 at 7:39am stephen sefick wrote: I plot the combined value with plot(four.comb, type="G"). What do the colors mean? I have both grey and black bars.

The help file for fourthcorner plainly tells you (sub Details). You can also work out the meaning by looking at the summary statistics: "The function plot produces a graphical representation of the results (white for non significant, light grey for negative significant and dark grey for positive significant relationships)."

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/4th-corner-analysis-ade4-what-do-the-colors-mean-tp4315476p4315508.html
Re: [R] Unexpected results using the oneway_test in the coin package
On Jan 09, 2012 at 11:48am Christoph Liedtke wrote: I should be detecting some non-significance between groups I and III at least, but the test comes back with extremely low p-values. Where am I going wrong?

Nowhere, I think. This does seem to be an error in coin. You should send your example to the maintainers of the package. Apart from the visual you have provided, there are other reasons for thinking that this is an error. First, if you redo the analysis excluding habitat II, then the contrast is not significant, as expected. Secondly, if you repeat the full analysis using package nparcomp, then you get the results you are expecting, based on the graphical representation of the data. See the examples below.

## drop habitat == "II"
NDWD <- oneway_test(breeding ~ habitat,
    data = droplevels(subset(mydata, habitat != "II")),
    ytrafo = function(data) trafo(data, numeric_trafo = rank),
    xtrafo = function(data) trafo(data, factor_trafo = function(x)
        model.matrix(~ x - 1) %*% t(contrMat(table(x), "Tukey"))),
    teststat = "max", distribution = approximate(B = 90))
print(NDWD)
print(pvalue(NDWD, method = "single-step"))

## use nparcomp
library(nparcomp)
npar <- nparcomp(breeding ~ habitat, data = mydata, type = "Tukey")
npar

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Unexpected-results-using-the-oneway-test-in-the-coin-package-tp4278371p4281329.html
Re: [R] Lavaan and Average Variance Extracted
On Dec 28, 2011 at 3:47am T.M. Rajkumar wrote: I need a way to get at the Variance Extracted information. Is there a simple way to do the calculation? Lavaan does not seem to output this.

It does. See:

library(lavaan)
?inspect
inspect(fit, "rsquare")

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Lavaan-and-Average-Variance-Extracted-tp4238791p4239818.html
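If what is wanted is the classic Fornell-Larcker AVE itself (rather than the per-indicator R-square values), it can be computed from the standardized loadings. A sketch, assuming a fitted lavaan model; the model syntax, the data set name `myData`, and the factor name `F1` are invented for illustration:

```r
## Sketch: Average Variance Extracted from standardized loadings (lavaan)
library(lavaan)

## hypothetical one-factor model; substitute your own syntax and data
model <- 'F1 =~ x1 + x2 + x3'
fit   <- cfa(model, data = myData)

std <- standardizedSolution(fit)
lam <- std$est.std[std$op == "=~" & std$lhs == "F1"]  # standardized loadings of F1
AVE <- mean(lam^2)                                    # mean squared loading
AVE
```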
Re: [R] PCA on high dimentional data
On Dec 10, 2011 at 5:56pm deb wrote: My question is, is there any way I can map the PC1, PC2, PC3 to the original conditions, so that I can still have a reference to original condition labels after PCA?

deb, To add to what Stephen has said: it would be best to read up on principal component analysis. Briefly, each PC is a composite variable, composed of different amounts of each and every one of your column variables, i.e. cond1, ..., cond1000. So the short answer to your question is no: there is no way to do this mapping, except via the loadings on each principal component (PC).

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/PCA-on-high-dimentional-data-tp4180467p4180890.html
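To make the loadings point concrete, a small sketch (simulated data and invented column names) showing how each PC relates back to the original columns through its loadings:

```r
## Sketch: loadings link each PC back to the original variables
set.seed(42)
dat <- data.frame(cond1 = rnorm(50), cond2 = rnorm(50), cond3 = rnorm(50))

pc <- prcomp(dat, scale. = TRUE)
pc$rotation           # loadings: the weight of each condition in each PC
pc$rotation[, "PC1"]  # PC1 expressed as a weighted mix of cond1..cond3
```

The labels survive only in this weighted sense; no single condition maps one-to-one onto a PC.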
Re: [R] How to scale arrows to approximately fill a plot region?
On Dec 07, 2011 at 10:20pm Michael Friendly asked: How to scale arrows to approximately fill a plot region?

Michael, Following Uwe... If you want code that does it, then look at what Daniel Chessel did in package ade4:

## scatter.dudi
function (x, xax = 1, yax = 2, clab.row = 0.75, clab.col = 1,
    permute = FALSE, posieig = "top", sub = NULL, ...)
{
    if (!inherits(x, "dudi"))
        stop("Object of class 'dudi' expected")
    opar <- par(mar = par("mar"))
    on.exit(par(opar))
    coolig <- x$li[, c(xax, yax)]
    coocol <- x$c1[, c(xax, yax)]
    if (permute) {
        coolig <- x$co[, c(xax, yax)]
        coocol <- x$l1[, c(xax, yax)]
    }
    s.label(coolig, clab = clab.row)
    born <- par("usr")
    k1 <- min(coocol[, 1])/born[1]
    k2 <- max(coocol[, 1])/born[2]
    k3 <- min(coocol[, 2])/born[3]
    k4 <- max(coocol[, 2])/born[4]
    k <- c(k1, k2, k3, k4)
    coocol <- 0.9 * coocol/max(k)
    s.arrow(coocol, clab = clab.col, add.p = TRUE, sub = sub,
        possub = "bottomright")
    add.scatter.eig(x$eig, x$nf, xax, yax, posi = posieig, ratio = 1/4)
}
<environment: namespace:ade4>

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/How-to-scale-arrows-to-approximately-fill-a-plot-region-tp4169871p4170400.html
Re: [R] glht for lme object with significant interaction term
On Nov 23, 2011 at 1:27am Andreas wrote: I would like to do multiple comparisons for treatment levels within day (i.e. across treatments for each day in turn)

Andreas, The following does what you want. To see how/why it works, look at the vignette to package multcomp, where there is an example. Note that I had to change mean.on.active to mean_on_active.

## library(nlme); library(multcomp)
Tdf <- Your attached data set
m.final <- lme(mean_on_active ~ treat * day, random = ~1|id,
    na.action = na.omit, data = Tdf)
Tdf$IntFac <- interaction(Tdf$treat, Tdf$day, drop=TRUE)
m.finalI <- lme(mean_on_active ~ IntFac, random = ~1|id,
    na.action = na.omit, data = Tdf)
summary(m.final)
summary(m.finalI)
glht(m.finalI, linfct = mcp(IntFac = "Tukey"))

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/glht-for-lme-object-with-significant-interaction-term-tp4097865p4099459.html
Re: [R] Has anyone used SIAR package add on?
On Nov 12, 2011 at 8:29pm Alex wrote: Has anyone used SIAR package add on?

I posted a reply to an earlier question from you on this subject. See http://r.789695.n4.nabble.com/Errors-in-SIAR-td4029804.html. In it I note that there are problems with the function from siar (not SIAR) you are using, but that this may not be your problem; that the function calls for matrices (you were using data frames); and that you are unlikely to get further help on this until you post your data (or data that resemble yours). It's not that people don't want to help you, but you have to give them something to work with (see the famous footer of this message).

One of the demos in the siar package mostly works, the other one does not. It's possible that there is a minor glitch somewhere, which could easily be fixed, so that given data in the correct format you get a result. Why don't you dput() a subset of your data, so that anyone who is interested in helping you can have a go? If your data set is called myData, and is stored as a data frame, then do something like the following and copy the result of dput() into your next email. Of course, if your data set has many rows then you will want to adjust the by argument (increase it). Twenty to thirty rows should be sufficient.

myPartData <- myData[seq(1, nrow(myData), by=3), ]
dput(myPartData)

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Has-anyone-used-SIAR-package-add-on-tp4035014p4036852.html
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Errors in SIAR
On Nov 11, 2011 at 1:06am Alex DJ wrote:

Alex, I haven't followed this thread closely (now split), but this most recent request for help from you is very difficult to make sense of. I think we need access to your data set. Further, there appear to be problems with the function you are using in package siar[1], based on trying to run the demos in the package. (Though this may not be the issue.)

Two things you can try are the following. First, the function you are using (siarmcmcdirichletv4) calls for matrices, not data frames (which you are using). Usually this throws an error, so maybe this is not the problem. Secondly, if you have a factor, then you turn it into an ordered factor by doing

ordered(myFactor)

where myFactor is the name of the factor you want to convert to an ordered factor. You can call the levels of your factor what you like. Here, reading up on these functions, and trying the examples, would help you to help yourself. Have you even looked at ?factor? The fact that you are new to R really means that you should be reading everything about R that you can lay your hands on. This will help you to ask sensible questions about things that still puzzle you. And if you want to convert a factor into a numeric, then you have to do something like

as.numeric(as.character(myFactor))

This is all documented in R. Have you read the manual "An Introduction to R", which comes with R? There is also plenty of good documentation on how to use R on the web, and indeed on R's homepage.

Regards, Mark.

[1] R is case-sensitive: the package is called siar, not SIAR. Please respect the author's designation.

View this message in context: http://r.789695.n4.nabble.com/Errors-in-SIAR-tp4029804p4030667.html
Re: [R] Estimate of intercept in loglinear model
On Nov 08, 2011 at 11:16am Colin Aitken wrote: An unresolved problem is: what does R do when the explanatory factors are not defined as factors when it obtains a different value for the intercept but the correct value for the fitted value?

Colin, I don't think that happens (i.e. that the fitted values are identical if the predictors are cast as numeric), but the following could (and really is answered by my initial answer). Once again, using the example I gave above, but using the second level of outcome as the reference level for a new fit, called glm.D93R. (For this part of the question a corpse would have been nice, though not really needed; yours was unfortunately buried too deeply for me to find it.)

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts    <- c(18,17,15,20,10,20,25,13,12)
outcome   <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93  <- glm(counts ~ outcome + treatment, family=poisson())
glm.D93R <- glm(counts ~ C(outcome, base=2) + treatment, family=poisson())
## treat predictors as numeric
glm.D93N <- glm(counts ~ as.numeric(as.character(outcome)) +
    as.numeric(as.character(treatment)), family=poisson())

coef(glm.D93)
  (Intercept)      outcome2      outcome3    treatment2    treatment3
 3.044522e+00 -4.542553e-01 -2.929871e-01  1.337909e-15  1.421085e-15

## Different value for the intercept, but the same fitted values (see below) as the earlier fit (above)
## summary(glm.D93R), snipped and edited for clarity
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.590e+00  1.958e-01  13.230   <2e-16 ***
outcome1     4.543e-01  2.022e-01   2.247   0.0246 *
outcome3     1.613e-01  2.151e-01   0.750   0.4535
treatment2  -3.349e-16  2.000e-01   0.000   1.0000
treatment3  -6.217e-16  2.000e-01   0.000   1.0000
<snip>

fitted(glm.D93)
   1    2    3    4    5    6    7    8    9
21.0 13.3 15.7 21.0 13.3 15.7 21.0 13.3 15.7
fitted(glm.D93R)
   1    2    3    4    5    6    7    8    9
21.0 13.3 15.7 21.0 13.3 15.7 21.0 13.3 15.7

## if the predictors are treated as numeric (check summary(glm.D93N) yourself)
fitted(glm.D93N)
       1        2        3        4        5        6        7        8        9
19.40460 16.52414 14.07126 19.40460 16.52414 14.07126 19.40460 16.52414 14.07126

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4017091.html
Re: [R] Estimate of intercept in loglinear model
On Nov 07, 2011 at 9:04pm Mark Difford wrote: So here the intercept represents the estimated counts...

Perhaps I should have added (though surely unnecessary in your case) that exponentiation gives the predicted/estimated counts, viz. 21 (compared to 18 for the saturated model).

##
exp(3.044522)
[1] 21

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012723.html
Re: [R] Estimate of intercept in loglinear model
On Nov 07, 2011 at 7:59pm Colin Aitken wrote: How does R estimate the intercept term \alpha in a loglinear model with a Poisson model and log link for a contingency table of counts?

Colin, If you fitted this using a GLM, then the default in R is to use so-called treatment contrasts (i.e. Dunnett contrasts). See ?contr.treatment. Take the first example on the ?glm help page:

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts    <- c(18,17,15,20,10,20,25,13,12)
outcome   <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
anova(glm.D93)
summary(glm.D93)
<snip>
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.045e+00  1.709e-01  17.815   <2e-16 ***
outcome2    -4.543e-01  2.022e-01  -2.247   0.0246 *
outcome3    -2.930e-01  1.927e-01  -1.520   0.1285
treatment2   1.338e-15  2.000e-01   0.000   1.0000
treatment3   1.421e-15  2.000e-01   0.000   1.0000
<snip>

levels(outcome)
[1] "1" "2" "3"
levels(treatment)
[1] "1" "2" "3"

So here the intercept represents the estimated counts at the first level of outcome (i.e. outcome = "1") and the first level of treatment (i.e. treatment = "1").

predict(glm.D93, newdata=data.frame(outcome="1", treatment="1"))
       1
3.044522

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012346.html
Re: [R] Estimate of intercept in loglinear model
On Nov 08, 2011 at 4:58am Rolf Turner wrote (in response to Professor Colin Aitken, Professor of Forensic Statistics!): <snip> Do you suppose you could provide a data-corpse for us to dissect?

Fortune nomination!!! I think Sherlock would have said, "But it's elementary, my dear Watson." Oftentimes a corpse is not necessary, as here.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4015131.html
Re: [R] testing significance of axis loadings from multivariate dudi.mix
On Nov 05, 2011 at 11:01pm Francisco Mora Ardila wrote: But a problem arose with the predict function: it doesn't seem to work with an object from dudi.mix and I don't understand why.

Francisco, There is no predict() method for dudi.mix() or for any of the dudi objects in ade4. I don't see why you can't get around this by doing something like the following, but you need to take account of any scaling/centring that you might do to your data before calling dudi.mix().

## Does a dudi.mix on continuous data, so really equals a dudi.pca/princomp/PCA
library(ade4)
data(deug)
deug.dudi <- dudi.mix(deug$tab, scann=FALSE, nf=2)
tt <- as.matrix(deug.dudi$tab) %*% as.matrix(deug.dudi$c1)  ## see note below
qqplot(deug.dudi$li[,1], tt[,1])
qqplot(deug.dudi$li[,2], tt[,2])

deug.princ <- princomp(deug$tab, cor=TRUE)
qqplot(predict(deug.princ)[,1], tt[,1])
## scaling not accounted for:
deug.princ <- princomp(deug$tab, cor=FALSE)
qqplot(predict(deug.princ)[,1], tt[,1])
rm(tt, deug.dudi, deug.princ)

Note that in the code given above, as.matrix(deug.dudi$tab) %*% as.matrix(deug.dudi$c1) is based on how stats:::predict.princomp does it.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/testing-significance-of-axis-loadings-from-multivariate-dudi-mix-tp3994281p3995350.html
Re: [R] Determining r2 values for a SEM
On Nov 04, 2011 at 6:55pm Katherine Stewart wrote: Is there a way to determine r2 values for an SEM in the sem package, or another way to get these values in R?

Katherine, rsquare.sem() in package sem.additions will do it for you.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Determining-r2-values-for-a-SEM-tp3990855p3991279.html
Re: [R] Contrasts with an interaction. How does one specify the dummy variables for the interaction
John, There is a good example of one way of doing this in the multcomp-examples.pdf vignette of package multcomp. See pages 8 to 10.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Contrasts-with-an-interaction-How-does-one-specify-the-dummy-variables-for-the-interaction-tp3948792p3950225.html
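The idea on those pages can be sketched as follows (the data and variable names here are simulated by me, not taken from the multcomp document): collapse the two factors into a single cell-means factor and test Tukey contrasts on that factor.

```r
## Sketch: contrasts on an interaction via a combined cell-means factor
library(multcomp)

set.seed(3)
dat <- data.frame(A = gl(2, 30, labels = c("a1", "a2")),
                  B = gl(3, 10, 60, labels = c("b1", "b2", "b3")))
dat$y <- rnorm(60) + as.numeric(dat$A) * as.numeric(dat$B)

dat$AB <- interaction(dat$A, dat$B, drop = TRUE)  # one factor, one level per cell
fit    <- lm(y ~ AB, data = dat)                  # one-way model on the combined factor
summary(glht(fit, linfct = mcp(AB = "Tukey")))    # all pairwise cell comparisons
```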
Re: [R] Levenshtein-Distance
On Oct 20, 2011 at 10:07am Jörg Reuter wrote: I want to compare a matrix row by row, and at the end I want a matrix with the Levenshtein distances.

Jörg, To begin with, try the following at the command prompt:

## RSiteSearch("Levenshtein")

This shows, amongst other hits, that package vwr has a function to calculate Levenshtein distances.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Levenshtein-Distance-tp3920951p3921252.html
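For completeness, and assuming a reasonably recent version of R: base R's utils package also provides adist(), which computes the generalized Levenshtein distance and, given a character vector, already returns the full element-by-element matrix asked for:

```r
## Row-by-row Levenshtein distances with base R's adist()
words <- c("kitten", "sitting", "kitchen")
D <- adist(words)                   # symmetric matrix of edit distances
dimnames(D) <- list(words, words)
D                                   # e.g. D["kitten", "sitting"] is 3
```

For rows of a character matrix, apply adist() to the pasted-together rows, or loop over row pairs.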
Re: [R] Ordered probit model -marginal effects and relative importance of each predictor-
On Aug 27, 2011 franco salerno wrote: ...problem with ordered probit model (polr function, library MASS). a) how to calculate marginal effects b) how to calculate the relative importance of each independent variable

Hi Franco, Have a look at John Fox's effects package (for a), and the Anova() function in his car package (for b).

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/Ordered-probit-model-marginal-effects-and-relative-importance-of-each-predictor-tp3773504p3774060.html
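A sketch of both suggestions on a built-in example (the housing data from MASS; the exact output layout depends on your package versions):

```r
## Sketch: effect displays and term-wise tests for an ordered probit fit
library(MASS)     # polr() and the housing data
library(effects)  # effect()
library(car)      # Anova()

fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            method = "probit")

Anova(fit)                 # likelihood-ratio test for each predictor (relative importance)
plot(effect("Infl", fit))  # fitted category probabilities across levels of Infl
```

Note that effect displays give fitted probabilities rather than marginal effects in the econometric sense; for the latter you would differentiate (or difference) these probabilities yourself.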
Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction
On Aug 19, 2011 khosoda wrote: I used x10.homals4$objscores[, 1] as a predictor for logistic regression, in the same way as PC1 in PCA. Am I going the right way?

Hi Kohkichi, Yes, but maybe explore the sets= argument (set Response as the target variable and the others as the predictor variables), then use the Dim1 scores. Also think about fitting a rank-1 restricted model, combined with the sets= option. See the vignette to the package, and look at:

Michailides, G. and de Leeuw, J. (1998). The Gifi system of descriptive multivariate analysis. Statistical Science, 13, 307-336.

Regards, Mark.

View this message in context: http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3755163.html
Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction
On Aug 17, 2011 khosoda wrote: 1. Is it O.K. to perform PCA for data consisting of 1 continuous variable and 8 binary variables? 2. Is it O.K to perform transformation of age from continuous variable to factor variable for MCA? 3. Is mjca1$rowcoord[, 1] the correct values as a predictor of logistic regression model like PC1 of PCA? Hi Kohkichi, If you want to do this, i.e. PCA-type analysis with different variable-types, then look at dudi.mix() in package ade4 and homals() in package homals. Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3752168.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
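A simulated, hedged illustration (all variable names invented) of why question 1 needs care: with one continuous and eight binary variables, an unscaled prcomp() lets the continuous column's variance swamp PC1, which is part of what dudi.mix() and homals() are designed to avoid.

```r
set.seed(1)
n   <- 100
age <- rnorm(n, 60, 10)                     # continuous, variance ~100
bin <- matrix(rbinom(n * 8, 1, 0.4), n, 8)  # eight 0/1 variables, variance ~0.24
colnames(bin) <- paste0("b", 1:8)
X <- cbind(age = age, bin)

p.raw    <- prcomp(X)                 # unscaled: PC1 is essentially 'age'
p.scaled <- prcomp(X, scale. = TRUE)  # unit-variance columns

abs(p.raw$rotation["age", 1])     # close to 1: age dominates PC1
abs(p.scaled$rotation["age", 1])  # much smaller once columns are scaled
```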
Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction
On Aug 18, 2011; Daniel Malter wrote: Pooling nominal with numeric variables and running PCA on them sounds like conceptual nonsense to me.

Hi Daniel, This is not true. There are methods that are specifically designed to do a PCA-type analysis on mixed categorical and continuous variables, viz. dudi.mix and dudi.hillsmith in package ade4. De Leeuw's homals method takes this a step further, doing, amongst other things, a non-linear version of PCA using any type of variable. Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3752516.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting vastly different results when running GLMs
On Aug 17, 2011; 5:43pm Luke Duncan wrote: Hi Luke, The differences you are seeing are almost certainly due to different contrast codings: Statistica probably uses sum-to-zero contrasts whereas R uses treatment (Dunnett) contrasts by default. You would be well advised to consult a local statistician for a deeper understanding. For some immediate insight do the following:

## Fits your model with different contrasts + a few other things.
##
library(car)
?contrasts
?contr.treatment
model1 <- glm(cbind(spec, total) ~ behav * loc, family = binomial, data = behdata,
    contrasts = list(behav = contr.treatment, loc = contr.treatment))
model2 <- glm(cbind(spec, total) ~ behav * loc, family = binomial, data = behdata,
    contrasts = list(behav = contr.sum, loc = contr.sum))
summary(model1)
summary(model2)
anova(model1, model2)  ## see: models seem different but are identical

## Type I SS
anova(model1)
anova(model2)

## Type II SS
library(car)
Anova(model1, type = "II")
Anova(model2, type = "II")

Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Getting-vastly-different-results-when-running-GLMs-tp3750496p3751115.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
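The point above in self-contained, hedged form (data simulated; a behav/loc model would behave the same way): the two contrast codings are reparameterisations of one model, so the fits agree exactly while the coefficient tables differ.

```r
set.seed(42)
d <- data.frame(f = gl(3, 30, labels = c("a", "b", "c")),
                y = rbinom(90, 1, 0.5))

m.trt <- glm(y ~ f, family = binomial, data = d,
             contrasts = list(f = "contr.treatment"))
m.sum <- glm(y ~ f, family = binomial, data = d,
             contrasts = list(f = "contr.sum"))

all.equal(fitted(m.trt), fitted(m.sum))  # TRUE: identical model fit
coef(m.trt)  # contrasts against the baseline level "a"
coef(m.sum)  # contrasts against the (unweighted) grand mean
```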
Re: [R] Regression - how to deal with past values?
On Aug 16, 2011 Eduardo M. A. M.Mendes wrote: Can I gather from your email that there is nothing available on R that deals with dynamic models (k-step ahead and free-run)? Eduardo, There may be others, but try package dlm (Bayesian and Likelihood Analysis of Dynamic Linear Models) by Giovanni Petris Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Regression-how-to-deal-with-past-values-tp3745817p3747302.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GLM different results with the same factors
On Jul 27, 2011 gaiarrido wrote: I've been reading these days about what you tell me, but i don't understand properly. How could I know, with this tests, which variables are significant?... Mario, You need to get in touch with a statistician at your university. You are fitting quite a complex analysis of covariance-type of model. This means (i.e. analysis of covariance) that the influence of edadysexo (your categorical variable) on your response variable is being assessed _after_ adjusting for or taking account of the influence of lcc. Did you try ## library(car) Anova(modo) as I advised you to? All I can say for the moment is that you have a significant interaction between edadysexo and mes (I don't know what mes is). Therefore, in general you can't talk about main effects for either. It is also clear that lcc will not be significant when the Type II analysis of variance test that I advised you to do is carried out. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/GLM-different-results-with-the-same-factors-tp3690471p3700799.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a question about glht function
On Jul 26, 2011 Lao Meng wrote:

glht(result, linfct = mcp(f_GROUP = "Tukey"))
Error in `[.data.frame`(mf, nhypo[checknm]) : undefined columns selected

It is almost certainly the underscore (_) in the name that is causing the problem. Try putting the term in quotes ("f_GROUP"). Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/a-question-about-glht-function-tp3695038p3695067.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GLM different results with the same factors
On Jul 24, 2011 Gaiarrido wrote: Why does the order of the factors give different results? I suppose it's because of the order of the factors; I've just changed lcc from the first position to the last in the model, and the significance changed completely ...snip How can I know which is correct?

Both are correct. R's default anova uses sequential sums of squares and so tests model terms sequentially. Reordering the factors in your model therefore leads to different hypotheses being tested. You are looking for what are often called Type II tests. To get them, use drop1() on your glm object, or install the car package and use its Anova() function. Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/GLM-different-results-with-the-same-factors-tp3690471p3690556.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
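A hedged, simulated demonstration of the sequential-versus-marginal point (predictors invented and deliberately correlated, so that term order matters):

```r
set.seed(7)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100)   # correlated with x1: sequential tests depend on order
y  <- x1 + x2 + rnorm(100)

m12 <- lm(y ~ x1 + x2)
m21 <- lm(y ~ x2 + x1)

anova(m12)["x1", "Sum Sq"]  # x1 entered first
anova(m21)["x1", "Sum Sq"]  # x1 entered last: a different number
drop1(m12, test = "F")      # marginal (Type II-style) tests:
drop1(m21, test = "F")      # identical either way
```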
Re: [R] Latex Table Help on R
On Jul 21, 2011 Jim Silverton wrote: However, I would like the standard deviations under the means in brackets. Can anyone check this code to see how this can be adjusted?

Jim, You need to use underset, a LaTeX (amsmath) command. The bare-bones call is $\underset{}{}$, where the underset value goes in the first curly and your main value goes in the second curly (i.e. is typeset above the underset). I don't use xtable but rather use Frank Harrell's functions in the Hmisc package, then pass the result through his latex() function, so can't take you further.

## paste('$\\underset', '{', data$SDs, '}', '{', data$means, '}$', sep="")

Hope this gets you going. Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Latex-Table-Help-on-R-tp3682951p3683038.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
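The paste() idea above, made concrete with invented means and SDs (the column names means and SDs follow the thread); each cell becomes an amsmath \underset expression:

```r
dat <- data.frame(means = c(5.3, 4.1), SDs = c(1.2, 0.9))
cells <- paste0("$\\underset{", dat$SDs, "}{", dat$means, "}$")
cells
## e.g. "$\\underset{1.2}{5.3}$" typesets 5.3 with 1.2 directly beneath it
```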
Re: [R] Package rrcov, functions PcaCov, PcaHubert, PcaGrid
On Jul 13, 2011; 12:03pm Albina wrote: Error: diff(sv) > 0 is not all TRUE ... . snip The same error occurs with the other functions. What does this mean and how can I perform the robust PCA with these functions by using a square matrix as input?

Albina, You at least need to feed it something sensible. Look at your matrix:

x <- matrix(0.5, 30, 30)
x

Try the following:

x <- rmultinom(30, size = 30, prob = rep(c(0.1, 0.2, 0.8), 30))
PcaCov(x)

Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Package-rrcov-functions-PcaCov-PcaHubert-PcaGrid-tp3664644p3664961.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
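A base-R look at why the constant matrix fails (hedged: the exact check is rrcov's, but the symptom is visible with svd() alone): every row of matrix(0.5, 30, 30) is identical, so the matrix has rank 1 and there is no spread for any PCA, robust or classical, to analyse.

```r
x  <- matrix(0.5, 30, 30)
sv <- svd(x)$d   # singular values, returned in non-increasing order
sum(sv > 1e-8)   # 1: the matrix is rank one
range(sv[-1])    # all remaining singular values are numerically zero
```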
Re: [R] ordiellipse
On May 26, 2011 Andrew Halford wrote: I am using ordiellipse to plot the ellipses and it works fine except one of my groups contains only 2 sites and I cannot get an ellipse around them. I'm assuming that 2 points is not enough to perform the relevant calculations here, however I would like to plot one if I could, if only for the sake of pictorial consistency. Ouch! for the rod that is likely to come. Advice? Collect more data, for the sake of pictorial consistency. And if you can't then you can't. What you have are the (available) facts. Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/ordiellipse-tp3551694p3551760.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multinomial Logistical Model
On May 24, 2011; 11:06pm Belle wrote: Does anyone know how to run Multinomial logistical Model in R in order to get predicted probability? Yes. I could stop there but you shouldn't. The author of the package provides plenty of examples (and two good vignettes) showing you how to do this. Suggest you do some work in that area. Look especially at how model formulas are used/specified. This is at least one area where you have gone wrong, as the error message clearly tells you. Good luck. Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Multinomial-Logistical-Model-tp3548239p3549611.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Panels order in lattice graphs
May 04, 2011; 5:50pm Cristina Silva wrote: In lattice graphs, panels are drawn from left to right and bottom to top. The flag as.table=TRUE changes this to left to right and top to bottom. Is there any way to change to first top to bottom and then left to right? I didn't find anything in the help pages or the Lattice book.

Cristina, You have not fully explained your problem. An approach I use for difficult-to-get-right arrangements is the following. Say you have two conditioning variables (13 panels in all) and you want to move the last panel, currently on the top row, to the first position on the bottom row, leaving everything else the same; then the easiest way is the following:

## Note: T.xyplot$index.cond is a __list__, so you need to use [[
T.xyplot <- xyplot(Prop ~ OM | interaction(Treatment, Aspect, drop = TRUE), data = myDat)
print(T.xyplot)
T.xyplot$index.cond
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13
T.xyplot$index.cond[[1]] <- c(13, 1:12)
print(T.xyplot)

Hope this helps to solve your problem. Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/Panels-order-in-lattice-graphs-tp3496022p3497691.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
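The index.cond trick above, runnable end to end on simulated data (13 panels from one made-up factor rather than Cristina's two conditioning variables):

```r
library(lattice)  # ships with R

d <- data.frame(x = rnorm(130), y = rnorm(130),
                g = gl(13, 10, labels = paste0("p", 1:13)))
p <- xyplot(y ~ x | g, data = d)
p$index.cond[[1]]                 # default drawing order: 1 2 ... 13
p$index.cond[[1]] <- c(13, 1:12)  # panel 13 is now drawn first
```

Printing p after the assignment draws the panels in the new order.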
Re: [R] combine lattice plot and standard R plot
On May 04, 2011 at 8:26pm Lucia Cañas wrote: I would like to combine lattice plot (xyplot) and standard R plot (plot and plotCI) in an unique figure. Hi Lucia, Combining the two systems can be done. See: Paul Murrell. Integrating grid graphics output with base graphics output. R News, 3(2):7-12, October 2003 http://cran.r-project.org/doc/Rnews/Rnews_2003-2.pdf Hope this helps. Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/combine-lattice-plot-and-standard-R-plot-tp3496409p3497717.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bwplot in ascending order
On May 01 (2011) Harold Doran wrote: Can anyone point me to examples with R code where bwplot in lattice is used to order the boxes in ascending order? You don't give an example and what you want is not entirely clear. Presumably you want ordering by the median (boxplot, and based on the example you point to, where the median is mentioned as an _example_). Is this what you want? ## bwplot(var1 ~ var2|condition, dat, index.cond = function(x, y) reorder(y, x, median)) ## if x is numeric bwplot(var1 ~ var2|condition, dat, index.cond = function(x, y) reorder(x, y, median)) ## if y is numeric Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/bwplot-in-ascending-order-tp3488557p3489544.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
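What reorder() is doing under the hood, in a hedged base-R sketch with invented groups: the factor's levels are re-sorted by each group's median, which is what fixes the box order.

```r
set.seed(1)
d <- data.frame(g = gl(3, 20, labels = c("a", "b", "c")),
                v = c(rnorm(20, 5), rnorm(20, 1), rnorm(20, 3)))
d$g2 <- reorder(d$g, d$v, median)  # relevel by ascending group median
levels(d$g2)                       # "b" "c" "a": medians ~1, ~3, ~5
```

bwplot(v ~ g2, d) would then draw the boxes in that ascending order.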
Re: [R] Problem with pairs() in nlme
Apr 08, 2011; 11:05pm dgmaccon wrote: I get the same error: Error in function (classes, fdef, mtable) : unable to find an inherited method for function "lmList", for signature "formula", "nfnGroupedData"

I get no such error. You need to provide more information (platform &c.)

##
library(nlme)
OrthoFem <- Orthodont[Orthodont$Sex == "Female", ]
fm1OrthF.lis <- lmList(distance ~ age, data = OrthoFem)
fm1OrthF.lis

Call:
  Model: distance ~ age | Subject
  Data: OrthoFem

Coefficients:
    (Intercept)   age
F10       13.55 0.450
F09       18.10 0.275
F06       17.00 0.375
F01       17.25 0.375
F05       19.60 0.275
F07       16.95 0.550
F02       14.20 0.800
F08       21.45 0.175
F03       14.40 0.850
F04       19.65 0.475
F11       18.95 0.675

Degrees of freedom: 44 total; 22 residual
Residual standard error: 0.6682746

Regards, Mark.

sessionInfo()
R version 2.13.0 Under development (unstable) (2011-03-15 r54806)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_South Africa.1252 LC_CTYPE=English_South Africa.1252
[3] LC_MONETARY=English_South Africa.1252 LC_NUMERIC=C
[5] LC_TIME=English_South Africa.1252

attached base packages:
[1] grid splines stats graphics grDevices utils datasets methods base

other attached packages:
[1] nlme_3.1-100 lattice_0.19-17 rms_3.3-0 Hmisc_3.8-3 survival_2.36-5

loaded via a namespace (and not attached):
[1] cluster_1.13.3 effects_2.0-12 tools_2.13.0

-- View this message in context: http://r.789695.n4.nabble.com/R-Problem-with-pairs-in-nlme-tp813548p3438266.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm: modelling zeros as binary and non-zeroes as coming from a continuous distribution
On Mar 30, 2011; 11:41am Mikhail wrote: I'm wondering if there's any way to do the same in R (lme can't deal with this, as far as I'm aware). You can do this using the pscl package. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/glm-modelling-zeros-as-binary-and-non-zeroes-as-coming-from-a-continuous-distribution-tp3417718p3417857.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Tukey HSD test using Multcomp
Mar 25, 2011; 12:58am Simon Bate wrote: I've been happily using the TukeyHSD function to produce Tukey's HSD tests but have decided to try out multcomp instead. However, when I carry out the test repeatedly I have found that multcomp produces slightly different values each time (see code below).

The random number generator comes into this, so you need to ?set.seed to get an identical result. Insert a set.seed statement in your code:

##
for (i in 1:20) {
  set.seed(3)  ## add this
  mult <- glht(lm(Resp1 ~ Treat1, data = statdata, na.action = na.omit), linfct = mcp(Treat1 = "Tukey"))
  multp <- summary(mult)
  tabs <- format(round(multp$test$pvalues, 6), nsmall = 6, scientific = FALSE)
  print(tabs)
}

Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/Tukey-HSD-test-using-Multcomp-tp3404038p3404906.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
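The underlying point, stripped of multcomp (whose multivariate-t p-values are computed by simulation): any random draw repeats exactly only if the seed is reset first.

```r
set.seed(3); a <- rnorm(5)
set.seed(3); b <- rnorm(5)
d <- rnorm(5)    # no reset: the stream has moved on

identical(a, b)  # TRUE  -- same seed, same draws
identical(a, d)  # FALSE -- different draws
```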
Re: [R] [R-sig-ME] lmm WITHOUT random factor (lme4)
On Mar 19, 2011; 01:39am Andrzej Galecki wrote: I agree with you that caution needs to be exercised. Simply because mathematically the same likelihood may be defined using different constants.

Yes. But this is ensured by the implementation. If the call to anova() is made with the lm$obj first in the sequence then an error is thrown. If the call is correctly made, with the lme$obj placed first in the sequence, then the log-likelihood of each object is calculated by nlme:::logLik.lme using the same formula [via lapply(object, logLik, REML), where logLik points to nlme:::logLik.lme]. You will note, as Andrzej Galecki has pointed out, that the logLik.lm of the lm$obj is different from logLik.lme.

##
logLik(fm)
'log Lik.' -950.1465 (df=3)
logLik(fm, REML=TRUE)
'log Lik.' -946.8318 (df=3)

Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/lmm-WITHOUT-random-factor-lme4-tp3384054p3389249.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lmm WITHOUT random factor (lme4)
Apologies to all for the multiple posting. Don't know what caused it. Maybe it _is_ time to stop using Nabble after all... Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/lmm-WITHOUT-random-factor-lme4-tp3384054p3386833.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lmm WITHOUT random factor (lme4)
On Mar 18, 2011; 10:55am Thierry Onkelinx wrote: Furthermore, I get an error when doing an anova between a lm() and a lme() model.

Hi Thierry, You get this error because you have not done the comparison the way I said you should, by putting the lme$obj model first in the call to anova(). This ensures that lme's anova method is invoked, rather than lm's anova method. You really do need to be careful with this...

## From my original posting
## Compare models with and without random effects
fm <- lm(Reaction ~ Days, sleepstudy)
fm1 <- lme(Reaction ~ Days, random = ~1|Subject, sleepstudy)
anova(fm1, fm)  ## lme-fitted model must come first

Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/lmm-WITHOUT-random-factor-lme4-tp3384054p3386810.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lmm WITHOUT random factor (lme4)
On Mar 17, 2011; 11:43am Baugh wrote: Question: can I simply substitute a dummy var (e.g. populated by zeros) for ID to run the model without the random factor? When I try this R returns values that seem reasonable, but I want to be sure this is appropriate.

If you can fit the model using lme (and it looks like you easily can) then another check would be:

## Compare models with and without random effects
fm <- lm(Reaction ~ Days, sleepstudy)
fm1 <- lme(Reaction ~ Days, random = ~1|Subject, sleepstudy)
anova(fm1, fm)  ## lme-fitted model must come first

Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/lmm-WITHOUT-random-factor-lme4-tp3384054p3384072.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
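The same comparison, runnable as-is with nlme's own Orthodont data (the sleepstudy data ship with lme4, not nlme); the likelihood-ratio line in the output tests whether the random intercept is needed:

```r
library(nlme)  # recommended package, ships with R

fm  <- lm(distance ~ age, data = Orthodont)
fm1 <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)
anova(fm1, fm)  # lme-fitted model must come first
```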
Re: [R] lmm WITHOUT random factor (lme4)
On Mar 17, 2011; 04:29pm Thierry Onkelinx wrote: You cannot compare lm() with lme() because the likelihoods are not the same. Use gls() instead of lm()

Hi Thierry, Of course, I stand subject to correction, but unless something dramatic has changed, you can. gls() can be used if you need to accommodate a correlation structure. The method I have outlined, i.e. anova(lme$obj, lm$obj), is detailed in Pinheiro & Bates (2000) beginning page 154. Please refer to this if you doubt me. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/lmm-WITHOUT-random-factor-lme4-tp3384054p3384802.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lmm WITHOUT random factor (lme4)
On Mar 17, 2011; 04:29pm Thierry Onkelinx wrote: You cannot compare lm() with lme() because the likelihoods are not the same. Use gls() instead of lm()

And perhaps I should have added the following: the first para on page 155 of Pinheiro & Bates (2000) states, "The anova method can be used to compare lme and lm objects." Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/lmm-WITHOUT-random-factor-lme4-tp3384054p3384823.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem implementing 'waldtest' when using 'mlogit' package
On Mar 13, 2011; 03:44pm Gaurav Ghosh wrote: I have been working through the examples in one of the vignettes associated with the 'mlogit' package, 'Kenneth Train's exercises using the mlogit package for R.' In spite of using the code unchanged, as well as the data used in the examples, I have been unable to run a Wald test to test two models.

It strikes me that you may not have given the full facts. I have no problem with this (using a development version of R). Note that you need to make H first.

data(Heating, package = "mlogit")
H <- mlogit.data(Heating, shape = "wide", choice = "depvar", varying = c(3:12))
m <- mlogit(depvar ~ ic + oc | 0, H)
summary(m)
mc <- mlogit(depvar ~ ic + oc, H, reflevel = "hp")
mi2 <- mlogit(depvar ~ oc + ic | income, H, reflevel = "hp")
waldtest(mc, mi2)

Wald test

Model 1: depvar ~ ic + oc
Model 2: depvar ~ oc + ic | income
  Res.Df Df  Chisq Pr(>Chisq)
1    894
2    890  4 4.6456     0.3256

What does print(mc) and print(mi2) report? Mark.

sessionInfo()
R version 2.13.0 Under development (unstable) (2010-10-14 r53299)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grDevices datasets stats4 splines graphics utils stats methods base

other attached packages:
[1] mlogit_0.2-1 maxLik_1.0-0 miscTools_0.6-10 lmtest_0.9-27 zoo_1.6-4
[6] statmod_1.4.9 Formula_1.0-0 rms_3.3-0 Hmisc_3.8-3 modeltools_0.2-17
[11] mvtnorm_0.9-96 survival_2.36-5

loaded via a namespace (and not attached):
[1] cluster_1.13.3 coin_1.0-18 colorspace_1.0-1 grid_2.13.0 lattice_0.19-17
[6] Matrix_0.999375-47 mboost_2.0-10 party_0.9-1 sandwich_2.2-6 tools_2.13.0

-- View this message in context: http://r.789695.n4.nabble.com/Problem-implementing-waldtest-when-using-mlogit-package-tp3351904p3351984.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rms: getting adjusted R^2 from ols object
On Mar 09, 2011; 11:09am Mark Seto wrote: How can I extract the adjusted R^2 value from an ols object (using rms package)?

library(rms)
x <- rnorm(10)
y <- x + rnorm(10)
ols1 <- ols(y ~ x)
## ols1$stats
ols1$stats[4]

Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/rms-getting-adjusted-R-2-from-ols-object-tp3343118p3343220.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
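A base-R cross-check (no rms required) so you can confirm which element of ols1$stats you have: summary.lm() reports the adjusted R^2 directly, and it matches the textbook formula.

```r
set.seed(10)
x <- rnorm(10)
y <- x + rnorm(10)
fit <- lm(y ~ x)

adjr2 <- summary(fit)$adj.r.squared
r2    <- summary(fit)$r.squared
## adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1); here n = 10, p = 1
all.equal(adjr2, 1 - (1 - r2) * (10 - 1) / (10 - 1 - 1))  # TRUE
```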
Re: [R] Change panel background color in spplot()
Marcel, Here is one way:

spplot(meuse.grid, zcol = "part.a", par.settings = list(panel.background = list(col = "grey")))
## trellis.par.get()
trellis.par.get()$panel.background

Regards, Mark.

On 03/05/2011 01:06 PM, Marcel J. wrote: Hi! How does one change the background color of the map-panel in spplot()? Example:

library(sp)
data(meuse.grid)
gridded(meuse.grid) = ~x+y
spplot(meuse.grid, "part.a")

How would I get another background-color for the map-panel (but not for the whole plot) here? Thank you! Marcel -- View this message in context: http://r.789695.n4.nabble.com/Change-panel-background-color-in-spplot-tp3336563p3336769.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
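Because spplot() draws through lattice, the par.settings mechanism can be checked without sp installed; a hedged sketch on a plain xyplot (data invented):

```r
library(lattice)

d <- data.frame(x = 1:10, y = rnorm(10))
p <- xyplot(y ~ x, data = d,
            par.settings = list(panel.background = list(col = "grey")))
p$par.settings$panel.background$col  # "grey", applied when p is printed
```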
Re: [R] mlogit.data
My previous posting seems to have got mangled. This reposts it.

On Mar 01, 2011; 03:32pm gmacfarlane wrote: workdata.csv The code I posted is exactly what I am running. What you need is this data. Here is the code again.

hbwmode <- mlogit.data(worktrips.csv, shape = "long", choice = "CHOSEN", alt.var = "ALTNUM")
hbwmode <- mlogit.data(hbwtrips, shape = "long", choice = "CHOSEN", alt.var = "ALTNUM")

You still have not done what the posting guide asks for but have expected me (or someone else) to scrutinize a large unknown data set (22033 rows). Fortunately there are other routes. Had you studied Yves Croissant's examples (?mlogit.data), which do work, you would have seen that your input or raw data have to have a particular format for mlogit.data to work. In particular, the alt.var (mode in the TravelMode data set and ALTNUM in your data set) has to go through all its levels in sequence. Yours don't (your variable has 6 levels but sometimes runs from 1 to 5, sometimes from 2 to 6, and so on). Within each run there must be only one choice.
##
library(mlogit)
data(TravelMode, package = "AER")
head(TravelMode, n = 20)
   individual  mode choice wait vcost travel gcost income size
1           1   air     no   69    59    100    70     35    1
2           1 train     no   34    31    372    71     35    1
3           1   bus     no   35    25    417    70     35    1
4           1   car    yes    0    10    180    30     35    1
5           2   air     no   64    58     68    68     30    2
6           2 train     no   44    31    354    84     30    2
7           2   bus     no   53    25    399    85     30    2
8           2   car    yes    0    11    255    50     30    2
9           3   air     no   69   115    125   129     40    1
10          3 train     no   34    98    892   195     40    1
11          3   bus     no   35    53    882   149     40    1
12          3   car    yes    0    23    720   101     40    1
13          4   air     no   64    49     68    59     70    3
14          4 train     no   44    26    354    79     70    3
15          4   bus     no   53    21    399    81     70    3
16          4   car    yes    0     5    180    32     70    3
17          5   air     no   64    60    144    82     45    2
18          5 train     no   44    32    404    93     45    2
19          5   bus     no   53    26    449    94     45    2
20          5   car    yes    0     8    600    99     45    2

When we look at just the relevant part of your data we have the following:

hbwtrips <- read.csv("E:/Downloads/workdata.csv", header = TRUE, sep = ",", dec = ".", row.names = NULL)
head(hbwtrips[, c(2:11)], n = 25)
   HHID PERID CASE ALTNUM NUMALTS CHOSEN  IVTT  OVTT  TVTT   COST
1     2     1    1      1       5      1 13.38  2.00 15.38  70.63
2     2     1    1      2       5      0 18.38  2.00 20.38  35.32
3     2     1    1      3       5      0 20.38  2.00 22.38  20.18
4     2     1    1      4       5      0 25.90 15.20 41.10 115.64
5     2     1    1      5       5      0 40.50  2.00 42.50   0.00
6     3     1    2      1       5      0 29.92 10.00 39.92 390.81
7     3     1    2      2       5      0 34.92 10.00 44.92 195.40
8     3     1    2      3       5      0 21.92 10.00 31.92  97.97
9     3     1    2      4       5      1 22.96 14.20 37.16 185.00
10    3     1    2      5       5      0 58.95 10.00 68.95   0.00
11    5     1    3      1       4      1  8.60  6.00 14.60  37.76
12    5     1    3      2       4      0 13.60  6.00 19.60  18.88
13    5     1    3      3       4      0 15.60  6.00 21.60  10.79
14    5     1    3      4       4      0 16.87 21.40 38.27 105.00
15    6     1    4      1       4      0 30.60  8.50 39.10 417.32
16    6     1    4      2       4      0 35.70  8.50 44.20 208.66
17    6     1    4      3       4      0 22.70  8.50 31.20 105.54
18    6     1    4      4       4      1 24.27  9.00 33.27 193.49
19    8     2    5      2       4      1 23.04  3.00 26.04  29.95
20    8     2    5      3       4      0 25.04  3.00 28.04  17.12
21    8     2    5      4       4      0 25.04 23.50 48.54 100.00
22    8     2    5      5       4      0 34.35  3.00 37.35   0.00
23    8     3    6      2       5      0 11.14  3.50 14.64  14.00
24    8     3    6      3       5      0 13.14  3.50 16.64   8.00
25    8     3    6      4       5      1  3.95 16.24 20.19 100.00

To show you that this is so we will mock up two variables that have the characteristics described above and use them to execute the function.
##
hbwtrips$CHOICEN <- rep(c(rep(0, 10), 1), 2003)
hbwtrips$ALTNUMTest <- gl(11, 1, 22033, labels = LETTERS[1:11])
hbwtrips[1:30, c(1:11, 44, 45)]
hbwmode <- mlogit.data(hbwtrips, varying = c(8:11), shape = "long", choice = "CHOICEN", alt.var = "ALTNUMTest")

Hope that helps, Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/mlogit-data-tp3328739p3330148.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] mlogit.data
On Feb 28, 2011; 10:33pm Gregory Macfarlane wrote: It seems as though the mlogit.data command tries to reassign my row.names, and doesn't do it right. Is this accurate? How do I move forward? Take the time to do as the posting guide asks you to do (and maybe consider the possibility that you have got it wrong). On that point, Z gave some further pointedly sound advice. Hope that helps, Mark. -- View this message in context: http://r.789695.n4.nabble.com/mlogit-data-tp3328739p3328865.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Set size of plot: in pdf() or par() ?
On Feb 23, 2011; 03:32pm Matthieu Stigler wrote: "I want to have a rectangular plot of size 0.5*0.3 inches. I am having surprisingly a difficult time to do it... ...snip... If I specify this size in pdf(), I get an error...

pdf("try.pdf", height=0.3, width=0.5)
par(pin=c(0.5, 0.3), mai=rep(0.1,4), omi=rep(0.01,4), mar=c(5,4,4,2))
plot(runif(100))
Error in plot.new() : figure margins too large"

You are specifying the margins twice, using different units, namely mai (in inches) and mar (in lines). The latter is throwing the error. To get what you want, try the following:

##
pdf(paper="special", file="try.pdf", height=0.5, width=0.3)
par(pin=c(0.5, 0.3), mai=rep(0.1,4), omi=rep(0.01,4))
plot(runif(100))
dev.off()

My PDF viewer (Adobe) tells me that the page size is 0.29 x 0.50 inch. Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/Set-size-of-plot-in-pdf-or-par-tp3319094p3321034.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] NPMC - replacement has 0 rows (multiple comparisons)
On 2011-02-20 20:02, Karmatose wrote: I'm trying to include multiple variables in a non-parametric analysis (hah!). So far what I've managed to figure out is that the NPMC package from CRAN MIGHT be able to do what I need... Also look at packages nparcomp and coin (+ multcomp). Both use the formula interface. For the coin+multcomp combo look at the example at the bottom of the help page for kruskal_test. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/NPMC-replacement-has-0-rows-multiple-comparisons-tp3316788p3317629.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] FW: multivariate regression
Deniz, "There are 3 F statistics, R2 and p-values. But I want just one R2 and p-value for my multivariate regression model." Which is as it should be. Maybe the following will help, but we are making the dependent variables the independent variables, which may or may not be what you really have in mind. (Otherwise, as Uwe has said, you need to specify how this one R^2 / p-value should be defined from your point of view.)

summary(lm(X ~ Y))

Call:
lm(formula = X ~ Y)

Residuals:
    Min      1Q  Median      3Q     Max
-20.329  -9.770   0.271  11.167  18.986

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   65.663     31.464   2.087    0.082 .
Y1            -4.232      4.616  -0.917    0.395
Y2             6.846      4.181   1.637    0.153
Y3            -4.145      4.616  -0.898    0.404
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 16.64 on 6 degrees of freedom
Multiple R-squared: 0.3641, Adjusted R-squared: 0.04622
F-statistic: 1.145 on 3 and 6 DF, p-value: 0.4041
--
View this message in context: http://r.789695.n4.nabble.com/multivariate-regression-tp3260141p3263712.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] different results in MASS's mca and SAS's corresp
"When I came to David's comment, I understood the theory, but not the numbers in his answer. I wanted to see the MASS mca answers match up with SAS, and the example did not (yet)." I am inclined to write, O ye of little faith. David showed perfectly well that when the results of the two algorithms are put on the same scale (z-scored) they are identical to the 4th decimal place (except for a change of sign in the second dimension, which is of no import). Surely that is miracle enough? Mark. -- View this message in context: http://r.789695.n4.nabble.com/different-results-in-MASS-s-mca-and-SAS-s-corresp-tp3261494p3262600.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Preparing dataset for glmnet: factors to dummies
Hi Frank, I believe that glmnet scales variables by their standard deviations. This would not be appropriate for categorical predictors. That's an excellent point, which many are likely to forget (including me) since one is using a model matrix. The default argument is to standardize inputs, but there is an option to turn it off. (One could then standardize continuous inputs on different scales oneself.) Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/Preparing-dataset-for-glmnet-factors-to-dummies-tp3250791p3253538.html Sent from the R help mailing list archive at Nabble.com.
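A minimal sketch of that option (the data frame df, response y, and index contvars are made up for illustration; glmnet's standardize argument controls the internal scaling):

```r
## Sketch: turn off glmnet's internal standardization and standardize
## only the continuous columns oneself (df, y, contvars are hypothetical)
library(glmnet)
X <- model.matrix(~ ., data = df)[, -1]   # factors expanded to dummy columns
X[, contvars] <- scale(X[, contvars])     # standardize continuous inputs only
fit <- glmnet(X, y, standardize = FALSE)  # dummies left on their 0/1 scale
```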
Re: [R] Factor rotation (e.g., oblimin, varimax) and PCA
Finn, "But when I use 'principal' I do not seem to be able to get the same results from prcomp and princomp and a 'raw' use of eigen: ...snip... So what is wrong with the rotations and what is wrong with 'principal'?" I would say that nothing is wrong. Right at the top of the help file for principal() [3rd line down] Professor Revelle notes that "The eigen vectors are rescaled by the sqrt of the eigen values to produce the component loadings more typical in factor analysis." You talk about differences and results not quite corresponding. But what actually are these differences, and what are their magnitudes? In cases like this, where you are questioning well-tested functions, you would be well advised to give concrete examples, accompanied by code that users can run without having to grub around in your questions. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/Factor-rotation-e-g-oblimin-varimax-and-PCA-tp3239099p3241553.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Problem with factor analysis
"Does anyone know what I am doing wrong?" Could be a lot or could be a little, but we have to guess, because you haven't given us the important information. That you are following Crawley is of little or no interest. We need to know what _you_ did. What is model and what's in it?

##
str(model)
attributes(model)

If you fitted your model using factanal then loadings(model)[,1] will fail with the following error message

##
loadings(factanal(m1, factors=3)[,1])
Error in factanal(m1, factors = 3)[, 1] : incorrect number of dimensions

Even if you did not see such a message it seems likely that model is in the wrong format for loadings to extract anything useful from it. Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-factor-analysis-tp3234117p3234334.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Discriminant Correspondence Analysis
Wayne, "I don't know how to assign a name for the df, or what to put for fac, and what is worse, I get an error message saying that the program cannot find the discrimin.coa command." Before you can use a package you have downloaded you need to activate it. There are different ways of doing this. Simplest is to type library(ade4).

##
library(ade4)
?discrimin.coa

Follow Bastiaan and read in your file as follows (single forward slashes also work):

## See ?read.csv as you may need to change some switches
MyFile <- read.csv("C:\\Documents and Settings\\USER\\My Documents\\MyFile.csv")
str(MyFile)

Without data it is difficult to help you further, but your general call to discrimin.coa is

## This may or may not work; depends what's in MyFile
T.discrimin <- discrimin.coa(MyFile, fac = someFacInMyFile, scannf = FALSE, nf = 4)
T.discrimin
plot(T.discrimin)

Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/Discriminant-Correspondence-Analysis-tp3087929p3088091.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] remove quotes from the paste output
Bob, "Does anybody know how to eliminate the double quotes so that I can use the variable name (generated with the paste function) further in the code..." ?noquote should do it.

##
varName
[1] "varName"
noquote(varName)
[1] varName

Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/remove-quotes-from-the-paste-output-tp3083692p3084037.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Tukey Test, lme, error: less than two groups
Lilith, "No the big mystery is the Tukey test. I just can't find the mistake, it keeps telling me that there are less than two groups ...

### Tukey test
##
summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))

Error message:
Fehler in glht.matrix(model = list(modelStruct = list(reStruct = list(Code .808654423456211, :
  ncol(linfct) is not equal to length(coef(model))"

You need to look, in particular, at what's in your variable Provenancef.

##
PAMdata$Provenancef
levels(PAMdata$Provenancef)

And if the call to glht() is returning an object you need to inspect obj$linfct. It contains the contrast matrix (for Tukey contrasts) and needs to match the number of coefficients in your model. Try the following for further clues:

glht(PAM.lme, linfct = mcp(Provenancef = "Tukey"))$linfct
coef(PAMaov)

Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/Tukey-Test-lme-error-less-than-two-groups-tp3069789p3071678.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] pca analysis: extract rotated scores?
Hi Liviu, "However, I'm still confused on how to compute the scores when rotations (such as 'varimax' or other methods in GPArotation) are applied." PCA does an orthogonal rotation of the coordinate system (axes) and further rotation is not usually done (in contrast to factor analysis). Neither prcomp nor princomp does any further rotation, and any rotate= argument in the call is simply being passed through. You can get what you want by feeding the scores from prcomp/princomp (or something else) to GPArotation. This returns rotated scores and the rotating matrix.

##
library(GPArotation)
.PC <- princomp(~ am + carb + cyl + disp + drat + gear + hp + mpg, cor = TRUE, data = mtcars)
.PCs <- .PC$scores
.PCrs <- entropy(.PCs)
.PCs
.PCrs

Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/pca-analysis-extract-rotated-scores-tp3065061p3066795.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] pca analysis: extract rotated scores?
Hi He Zhang, Is the following right for extracting the scores? ... pca$loadings pca$score Yes. But you should be aware that the function principal() in package psych is standardizing your data internally, which you might not want. That is, the analysis is being based on the correlation matrix rather than on the covariance matrix. The two analyses (and the default biplots that issue from them) have somewhat different properties. You should know about these. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/pca-analysis-extract-rotated-scores-tp3065061p3067370.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Compute polychoric matrix for more than 2 variables
Hi Raquel, "routine in R to compute polychoric matrix to more than 2 categorical variables? I tried polycor package, but it seems to be suited only to 2-dimensional problems." But surely ?hetcor (in package polycor) does it. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/Compute-polychoric-matrix-for-more-than-2-variables-tp3064941p3065027.html Sent from the R help mailing list archive at Nabble.com.
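A minimal sketch of the sort of call meant here (the ordinal data are made up for illustration):

```r
## hetcor() computes the full heterogeneous correlation matrix in one go;
## correlations between ordinal variables are polychoric
library(polycor)
set.seed(1)
d <- data.frame(a = ordered(sample(1:4, 200, replace = TRUE)),
                b = ordered(sample(1:4, 200, replace = TRUE)),
                c = ordered(sample(1:4, 200, replace = TRUE)))
hetcor(d)$correlations   # 3 x 3 polychoric correlation matrix
```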
Re: [R] Wilcoxon Rank Sum in R with a multiple testing correction
Hi Selthy, "I'd like to use a Wilcoxon Rank Sum test to compare two populations of values. Further, I'd like to do this simultaneously for 114 sets of values." Well, you read your data set into R using:

##
?read.table
?read.csv

There are other ways to bring in data. Save the import to a workspace object at the same time:

myDat <- read.csv(...)

Do the Wilcoxon Rank Sum tests using the implementation of your choice (there are several):

## See the examples at foot of help page. Lacking data we will make some.
?wilcox.test
pv1 <- wilcox.test(rnorm(10), rnorm(10, 2), conf.int = TRUE)$p.value
pv2 <- wilcox.test(rnorm(10), rnorm(10, 2), conf.int = TRUE)$p.value
pv3 <- wilcox.test(rnorm(10), rnorm(10, 2), conf.int = TRUE)$p.value

Eventually you will discover more elegant ways of assembling a vector (or some other type of storage object). Finally, you feed your p-values to:

##
?p.adjust
pAdj <- p.adjust(c(pv1, pv2, pv3), method = "BH")

##
?round
?sprintf
cbind.data.frame(Uncorrected = c(pv1, pv2, pv3), BH_Corrected = pAdj)

Eventually you will discover how to turn all of this into an elegant function. I really do hope that this is not a school assignment. If so... Well, you still need to do some work to get this going. Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/Wilcoxon-Rank-Sum-in-R-with-a-multiple-testing-correction-tp3056557p3056878.html
Sent from the R help mailing list archive at Nabble.com.
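For the record, one of those "more elegant ways", using the same made-up data as above (sapply replaces the hand-typed pv1, pv2, pv3):

```r
## Assemble all 114 p-values in one vector, then correct them together
pvals <- sapply(1:114, function(i) wilcox.test(rnorm(10), rnorm(10, 2))$p.value)
pAdj  <- p.adjust(pvals, method = "BH")
cbind.data.frame(Uncorrected = pvals, BH_Corrected = pAdj)
```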
Re: [R] Factor analysis and cfa with asymptotically distributed data
Jane, "Does someone know how to do fa and cfa with strongly skewed data?" Your best option might be to use a robustly estimated covariance matrix as input (see packages robust/robustbase). Or you could turn to packages FAiR or lavaan (maybe also OpenMx). Or you could try soft modelling via package plspm. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/Factor-analysis-and-cfa-with-asymptotically-distributed-data-tp3056387p3056944.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] changing the limits of a secondary y-axis in a barplot
Hi Anna, "How can I change the barplot so that the left hand axis scales from 0 to 15 and the right hand axis from 0 to 5?" Try this:

par(mfrow=c(1,1), mai=c(1.0,1.0,1.0,1.0))
Plot1 <- barplot(rbind(Y1,Y2), beside=TRUE, axes=TRUE, names.arg=c("a","b"),
  ylim=c(0,15), xlim=c(1,9), space=c(0,1), col=c("darkgray","white"), yaxt="n")
Plot2 <- barplot(Y3, add=TRUE, beside=TRUE, names.arg="c",
  col=c("darkgray","white"), ylim=c(0,5), space=c(0,7), width=1, yaxt="n")
axis(side=2, at=seq(0,15,3), labels=seq(0,15,3))
axis(side=4, at=seq(0,15,3), labels=seq(0,5,1))

Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/changing-the-limits-of-a-secondary-y-axis-in-a-barplot-tp3046117p3046283.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] ANOVA table and lmer
Hi Jim, "The decomposition of the sum of squares should be the same regardless of whether block is treated as random or fixed." Should it? By whose reckoning? The models you are comparing are different. Simple consideration of the terms listed in the (standard) ANOVA output shows that this is so, so how could the sums of squares be the same? "I noticed that other people have asked similar questions in the past, but I haven't seen a satisfactory explanation." Maybe, but it has been answered (by me, and surely by others). However, canonical would be Venables and Ripley's MASS (: 283--286). The models you need to compare are the following:

## oats data from package MASS; lme() is in nlme, lmer() in lme4
Aov.mod <- aov(Y ~ V * N + Error(B/V/N), data = oats)
Lme.mod <- lme(Y ~ V * N, random = ~1 | B/V/N, data = oats)
Lmer.mod <- lmer(Y ~ V * N + (1|B) + (1|B:V) + (1|B:N), data = oats)
summary(Aov.mod)
anova(Lme.mod)
anova(Lmer.mod)

HTH, Mark Difford.
--
View this message in context: http://r.789695.n4.nabble.com/ANOVA-table-and-lmer-tp3027546p3027662.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] getting all contrasts from glm
Jim, In the glm object I can find the contrasts of the main treats vs the first i.e. 2v1, 3v1 and 4v1 ... however I would like to get the complete set including 3v2, 4v2, and 4v3 ... along with the Std. Errors of all contrasts. Your best all round approach would be to use the multcomp package. In particular, look at ?glht ?summary.glht ?contrMat Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/getting-all-contrasts-from-glm-tp3007027p3007421.html Sent from the R help mailing list archive at Nabble.com.
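A hedged sketch of what that looks like in practice (the 4-level factor and binary response are simulated; mcp(treat = "Tukey") requests all pairwise contrasts, with their standard errors, from the fitted glm):

```r
## All pairwise (Tukey) contrasts from a glm via multcomp (simulated data)
library(multcomp)
set.seed(1)
dat <- data.frame(treat = gl(4, 25), y = rbinom(100, 1, 0.5))
fit <- glm(y ~ treat, family = binomial, data = dat)
summary(glht(fit, linfct = mcp(treat = "Tukey")))  # 2-1, 3-1, 4-1, 3-2, 4-2, 4-3
```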
Re: [R] Creating publication-quality plots for use in Microsoft Word
"I'd prefer to stick with JPEG, TIFF, PNG, or the like. I'm not sure EPS would fly." Preferring to stick with bitmap formats (like JPEG, TIFF, PNG) is likely to give you the jagged lines and other distortions you profess to want to avoid. EPS (encapsulated PostScript, which handles vector+bitmap) is one of the graphic file formats preferred by most quality journals. Surprisingly, not too many people seem to be aware of the fact that PDF really is a crippled form of PostScript. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/Creating-publication-quality-plots-for-use-in-Microsoft-Word-tp2540676p2540858.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] path analysis
Guy, For a partial least squares approach look at packages plspm and pathmox. Also look at sem.additions. Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/path-analysis-tp2528558p2530207.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Coinertia randtest
Hi Petar, I dunno why, but I cannot make randtes[t].coinertia() from ade4 package working. I have two nice distance matrices (Euclidean): Could anyone help with this? Yes (sort of). The test has not yet been implemented for dudi.pco, as the message at the end of your listing tells you. The result is: Error in randtest.coinertia(coi, nrepet = 1000) : Not yet available So far it has only been implemented for some types of dudi.pca, and for dudi.coa, dudi.fca, and dudi.acm. You might be lucky and find code to do what you want on the ade4 list. http://pbil.univ-lyon1.fr/ADE-4/home.php?lang=eng http://listes.univ-lyon1.fr/wws/info/adelist Regards, Mark. -- View this message in context: http://r.789695.n4.nabble.com/Coinertia-randtest-tp2335696p2335748.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] reading lmer table
Hi Nicola, "In few words: does this row indicate a global effect of the predictor 'cat' or a more specific passage?" It indicates a more specific passage. Use anova(m7) for the global/omnibus effect. Check this for yourself by fitting the model with different contrasts. The default contrasts in R are treatment contrasts.

##
m7 <- lmer(log.second ~ Cond + cat + (1|subjID) + (1|Code), data = march.f,
  contrasts = list(Cond = contr.treatment, cat = contr.treatment))
m7s <- lmer(log.second ~ Cond + cat + (1|subjID) + (1|Code), data = march.f,
  contrasts = list(Cond = contr.sum, cat = contr.sum))
summary(m7)
summary(m7s)
anova(m7)
anova(m7s)

Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/reading-lmer-table-tp2329521p2330809.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] [Fwd: Re: Plotting log-axis with the exponential base to a plot with the default logarithm base 10]
Elisabeth, You should listen to Ted (Harding). He answered your question with: "the vertical axis is scaled logarithmically with the numerical annotations corresponding to the *raw* values of Y, not to their log-transformed values. Therefore it does not matter what base of logarithms is used..." And he has tried to help you further by asking you to answer the following question: "How is it that you recognise that it is a logarithmic axis with the base of 10, as opposed to any other base?" Maybe a little graphic demonstration (run the script below) will help to convince you. The top row of the figure re-poses Ted's question, viz. How do you know which logarithmic base was used to plot which sub-figure? You will see from my script that the first is plotted using natural logs, whereas the next two use logs to base 10. (The last of them uses R's log="y" argument.) What's the difference? The first two sub-figures on the bottom row show the scales that were hidden in the sub-figures above them (from the script you will see that all I did was turn off the annotation/labelling of the scale). The last sub-figure on the bottom row is identical in __appearance__ to that above it. However, as you will see from my script, it plots data that have been transformed to natural logarithms. For scale-annotation, however, I have used raw data values. That was Ted's first point: it doesn't matter which base of logarithm your data have been transformed to if you annotate the scale using (backtransformed) raw values.
## Show that log="y" does the job you want done even if internally it uses logs to the base 10
## rather than natural logarithms
par(mfrow=c(2,3))
plot(log(1:10), yaxt="n", ylab="")
plot(log10(1:10), yaxt="n", ylab="")
plot((1:10), log="y")
plot(log(1:10))
plot(log10(1:10))
plot(log(1:10), yaxt="n")
axis(side=2, at=sapply(c(1,2,5,10), FUN=function(x) log(x)), labels=c(1,2,5,10))

Regards, Mark
--
View this message in context: http://r.789695.n4.nabble.com/Fwd-Re-Plotting-log-axis-with-the-exponential-base-to-a-plot-with-the-default-logarithm-base-10-tp2172435p2173481.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] [Fwd: Re: Plotting log-axis with the exponential base to
Hi All, You can also add a line using lines() if you transform in the call using the same log-base---but not via R's log="y" argument (because of what's stored in par("yaxp")).

##
par(mfrow=c(1,3))
plot(1:10, log="y")
lines(log10(1:10))
par("yaxp")
plot(log10(1:10), yaxt="n")
axis(side=2, at=sapply(c(1,2,5,10), FUN=function(x) log10(x)), labels=c(1,2,5,10))
lines(log10(1:10))
par("yaxp")
plot(log(1:10), yaxt="n")
axis(side=2, at=sapply(c(1,2,5,10), FUN=function(x) log(x)), labels=c(1,2,5,10))
lines(log(1:10))
par("yaxp")

Regards, Mark.
--
View this message in context: http://r.789695.n4.nabble.com/Fwd-Re-Plotting-log-axis-with-the-exponential-base-to-a-plot-with-the-default-logarithm-base-10-tp2172435p2182338.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Logistic and Linear Regression Libraries
Hi Phil, "So far for logistic regression I've tried glm(MASS) and lrm (Design) and found there is a big difference." Be sure that you mean what you say, that you are saying what you mean, and that you know what you mean when making such statements, especially on this list. glm is not in MASS, so perhaps you mean polr in package MASS. And no, there is no big difference. You are doing something wrong.

## Edited output from polr and lrm
house.lrm <- lrm(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
house.lrm

Logistic Regression Model

lrm(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
...snip...
                   Coef     S.E.     Wald Z  P
y>=Medium          0.4961   0.12485    3.97  0.0001
y>=High           -0.6907   0.12547   -5.50  0.0000
Infl=Medium        0.5664   0.10465    5.41  0.0000
Infl=High          1.2888   0.12716   10.14  0.0000
Type=Apartment    -0.5724   0.11924   -4.80  0.0000
Type=Atrium       -0.3662   0.15517   -2.36  0.0183
Type=Terrace      -1.0910   0.15149   -7.20  0.0000
Cont=High          0.3603   0.09554    3.77  0.0002

house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
summary(house.plr)

Re-fitting to get Hessian

Call:
polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)

Coefficients:
                   Value       Std. Error    t value
InflMedium     0.5663937   0.10465285   5.412120
InflHigh       1.2888191   0.12715641  10.135699
TypeApartment -0.5723501   0.11923824  -4.800055
TypeAtrium    -0.3661866   0.15517362  -2.359851
TypeTerrace   -1.0910149   0.15148617  -7.202076
ContHigh       0.3602841   0.09553587   3.771192
...snip...

Regards, Mark.

tdm wrote: Hi all, I'm trying to discover the options available to me for logistic and linear regression. I'm doing some tests on a dataset and want to see how different flavours of the algorithms cope. So far for logistic regression I've tried glm(MASS) and lrm (Design) and found there is a big difference. Is there a list anywhere detailing the options available which details the specific algorithms used?
Thanks in advance, Phil -- View this message in context: http://old.nabble.com/Logistic-and-Linear-Regression-Libraries-tp26140248p26140375.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] a publication quality table for summary.zeroinfl()
Hi Chris, "My ideal would be to gather the information onto the clipboard so I could paste it into Excel and do the formatting there, but any approach would be better than what I have now." I would never use Excel for this since there are far superior tools available. But it is very easy to assemble the two parts into a single table for further manipulation. Not sure about the clipboard route, but the following rather crude approach saves the object (a list) to a *.csv file in your working directory.

## Simple, crude approach, with theta replicated as last column in the saved *.csv
data(bioChemists, package = "pscl")
fm_zinb <- zeroinfl(art ~ . | 1, data = bioChemists, dist = "negbin")
Temptab <- list(coef = rbind(summary(fm_zinb)$coefficients[[1]],
                             summary(fm_zinb)$coefficients[[2]]),
                theta = fm_zinb$theta)
getwd()
write.csv(Temptab, file = "Temptab.csv")
read.csv("Temptab.csv")

## If you have a LaTeX distribution
library(Hmisc)
latex(list(coef = rbind(summary(fm_zinb)$coefficients[[1]],
                        summary(fm_zinb)$coefficients[[2]]),
           theta = fm_zinb$theta),
      cdec = c(3,3,3,4,3))

Regards, Mark.

Chris Fowler wrote: I will swallow my pride and post to this list for the second time in 24 hours--I have a paper due to a reviewer and I am desperate. Has anybody written code to move the output from summary() called on the results of a zeroinfl() model (from the pscl package) into a form suitable for publication? When I hit send on this message I am going to begin hand typing stars into a spreadsheet. The problem is that the zero-inflated model has two parts: a count and a zero portion--its coefficients are stored in separate arrays and there is a Log(theta) that needs to be thrown in there that is in a completely separate place in the structure of the summary function. As a result the functions that I have found for outputting summaries of other linear models all give error messages when attempted on this summary object.
My ideal would be to gather the information onto the clipboard so I could paste it into Excel and do the formatting there, but any approach would be better than what I have now. Thanks, Chris -- View this message in context: http://old.nabble.com/a-publication-quality-table-for-summary.zeroinfl%28%29-tp26140199p26140542.html
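On the clipboard route Chris asked about: on Windows, a connection named "clipboard" lets write.table() put a tab-separated table straight onto the clipboard for pasting into Excel. A hedged sketch; the data frame here is a made-up stand-in for the combined zeroinfl coefficient table, and the non-Windows branch just falls back to a file.

```r
# Stand-in for the assembled coefficient table (hypothetical values)
tab <- data.frame(Estimate = c(0.12, -0.58),
                  `Std. Error` = c(0.05, 0.21),
                  check.names = FALSE)
# On Windows, file = "clipboard" writes to the system clipboard;
# elsewhere, write a TSV file instead
out <- if (.Platform$OS.type == "windows") "clipboard" else
  tempfile(fileext = ".tsv")
write.table(tab, file = out, sep = "\t", col.names = NA)
```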
Re: [R] Logistic and Linear Regression Libraries
tdm wrote: OK, I think I've figured it out, the predict of lrm didn't seem to pass it through the logistic function. If I do this then the value is similar to that of lm. Is this by design? Why would it be so? Please take some time to read the help files on these functions so that you at least understand that they fit different models. ?Design:::lrm ?Design:::predict.lrm ?Design:::datadist ?lm This really is just a plainer way of saying what I said before. HTH, Mark. tdm wrote: OK, I think I've figured it out, the predict of lrm didn't seem to pass it through the logistic function. If I do this then the value is similar to that of lm. Is this by design? Why would it be so? 1 / (1 + exp(-1 * 3.38)) = 0.967 tdm wrote: Anyway, do you know why the lrm predict gives me a value of 3.38? -- View this message in context: http://old.nabble.com/Logistic-and-Linear-Regression-Libraries-tp26140248p26141711.html
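The inverse-logit step tdm did by hand is built into base R as plogis(), which is a less error-prone way to turn a linear predictor into a probability:

```r
eta <- 3.38                 # the linear predictor lrm's predict() returned
p_hand <- 1 / (1 + exp(-eta))   # by hand, as in the post
p_base <- plogis(eta)           # same inverse logit, built in: ~0.967
```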
Re: [R] Lost all script
Hi David, Now when I turn on R again the script is now completely blank. This happened to me about 4--5 months ago under Vista. I cannot quite remember what I did, but I think I got the script working by opening it in another editor (a hex editor would do) and removing either the first few bytes or the last few bytes. If you still have the file, try this route. Good luck, Mark. David Young-18 wrote: Hi all, I just had a rather unpleasant experience. After considerable work I finally got a script working and set it to run. It had some memory allocation problems when I came back, so I used Windows to stop it. During that process it told me that the script had been changed and asked if I wanted to save it. Not being positive that I'd saved the very last changes, I said yes. Now when I turn on R again the script is completely blank. I guess my questions are: Is there a way to interrupt a program without using Windows? Is there any way to recover my script? And a nice to know: Anybody know why it saved blank space as the new script? Thanks for any advice. A humble, and humbled, new R user. -- Best regards, David Young Marketing and Statistical Consultant Madrid, Spain +34 913 540 381 http://www.linkedin.com/in/europedavidyoung mailto:dyo...@telefonica.net -- View this message in context: http://www.nabble.com/Lost-all-script-tp26096908p26101731.html
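The byte-removal idea can even be done from R itself. A self-contained sketch under one assumption: that the "blanked" file is padded with NUL bytes around otherwise intact text (a common symptom of a crash-time save). The demo file here is made up.

```r
# Create a made-up "damaged" script: NUL bytes wrapped around real code
path <- tempfile(fileext = ".R")
writeBin(c(as.raw(0), charToRaw("x <- 1\n"), as.raw(0)), path)

# Read the raw bytes and drop the NUL padding
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
clean <- bytes[bytes != as.raw(0)]
rawToChar(clean)    # the recovered script text
```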
Re: [R] Missing data and LME models and diagnostic plots
Peter Flom wrote: I am puzzled by the performance of LME in situations where there are missing data. As I understand it, one of the strengths of this sort of model is how well it deals with missing data, yet lme requires nonmissing data. You are confusing missing data with an unbalanced design. A strength of LME is that it copes with the latter, which aov() does not. Regards, Mark. Peter Flom-2 wrote: Hello, Running R 2.9.2 on Windows XP. I am puzzled by the performance of LME in situations where there are missing data. As I understand it, one of the strengths of this sort of model is how well it deals with missing data, yet lme requires nonmissing data. Thus,

m1.mod1 <- lme(fixed = math_1 ~ I(year-2007.5)*TFC_, data = long,
               random = ~I(year-2007.5)|schoolnum)

causes an error in na.fail.default, but adding na.action = na.omit makes a model with no errors. However, if I create that model, i.e.

m1.mod1 <- lme(fixed = math_1 ~ I(year-2007.5)*TFC_, data = long,
               random = ~I(year-2007.5)|schoolnum, na.action = na.omit)

then the diagnostic plots suggested in Pinheiro & Bates produce errors; e.g.

plot(m1.mod1, schoolnum~resid(.), abline = 0)

gives an error "could not find function NaAct". Searching the archives showed a similar question from 2007, but did not show any responses. Thanks for any help, Peter. Peter L. Flom, PhD Statistical Consultant Website: www DOT peterflomconsulting DOT com Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom -- View this message in context: http://www.nabble.com/Missing-data-and-LME-models-and-diagnostic-plots-tp25996584p25998546.html
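The na.action = na.omit behaviour Peter describes is easy to see on a standard data set. A sketch using the Orthodont data that ships with nlme: delete two responses, then fit with na.action = na.omit; lme() drops the incomplete rows and fits the now-unbalanced data without complaint.

```r
library(nlme)                     # shipped with R (recommended package)
ortho <- Orthodont
ortho$distance[c(5, 40)] <- NA    # introduce two missing responses
fm <- lme(distance ~ age * Sex, data = ortho,
          random = ~ 1 | Subject, na.action = na.omit)
length(resid(fm))                 # 106 of the original 108 rows were used
```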
Re: [R] Missing data and LME models and diagnostic plots
Hi Peter, See e.g. Hedeker and Gibbons, Longitudinal Data Analysis, which repeatedly stresses that mixed models provide good estimates if the data are missing at random. This may be true. However, one of the real strengths of LME is that it handles unbalanced designs, which is a different thing. The fact that it also gets good estimates when data are missing is a bonus. Regards, Mark. Peter Flom-2 wrote: I wrote: I am puzzled by the performance of LME in situations where there are missing data. As I understand it, one of the strengths of this sort of model is how well it deals with missing data, yet lme requires nonmissing data. Mark Difford replied: You are confusing missing data with an unbalanced design. A strength of LME is that it copes with the latter, which aov() does not. Thanks for your reply, but I don't believe I am confused with respect to missing data in mixed models. See e.g. Hedeker and Gibbons, Longitudinal Data Analysis, which repeatedly stresses that mixed models provide good estimates if the data are missing at random. Regards, Peter. Peter L. Flom, PhD Statistical Consultant Website: www DOT peterflomconsulting DOT com Writing; http://www.associatedcontent.com/user/582880/peter_flom.html Twitter: @peterflom -- View this message in context: http://www.nabble.com/Missing-data-and-LME-models-and-diagnostic-plots-tp25996584p26004465.html
Re: [R] Re posting various problems with two-way anova, lme, etc.
Hi Michael, How do you control what is the (intercept) in the model returned by the lme function, and is there a way to still be able to refer to all groups and timepoints in there without referring to the intercept? Here is some general help. The intercept is controlled by the contrasts that you used when fitting your model. By default these are treatment (i.e. Dunnett) contrasts, but you can change this.

## ?options
options("contrasts")
?contrasts
options(contrasts = c("contr.sum", "contr.poly"))
options("contrasts")

You can design your own contrasts or you can use those provided by base R:

## ?contr.treatment

(There is also contr.sdif in MASS, which is a very useful type for environmental studies.) See http://users.monash.edu.au/~murray/stats/BIO4200/LinearModels.pdf, which covers the main ready-made types in R. For ANOVA, sum-to-zero contrasts are commonly used. If you do use treatment contrasts you can (usually) change your reference level on-the-fly, or you can set it when you make your factor.

## ?factor
?levels
?reorder
?relevel

If you are using the multcomp package then look at:

?contrMat

An example of how to use it is given under the glht function. Regards, Mark. Michael Schacht Hansen wrote: Hi, I posted the message below last week, but no answers, so I'm giving it another attempt in case somebody who would be able to help might have missed it and it has now dropped off the end of the list of mails. I am fairly new to R and still trying to figure out how it all works, and I have run into a few issues. I apologize in advance if my questions are a bit basic; I'm also no statistics wizard, so part of my problem may be a more fundamental lack of knowledge in the field. I have a dataset that looks something like this:

Week Subj Group Readout
0    1    A     352.2
1    1    A     366.6
2    1    A     377.3
3    1    A     389.8
4    1    A     392.5
0    2    B     344.7
1    2    B     360.9
.    .    .     .

So basically I have a number of subjects that are divided into different groups receiving different interventions/treatments.
Observations on these subjects are made on 5 occasions (Week 0-4). I would like to see if there is a difference between the treatment groups and if the observations that we are making change in the individual groups over time. According to my very limited statistics training, that means that I would have to do a two-way ANOVA with repeated measures, because the same subjects are observed on the different weeks. Now in R I can do something like this:

MyData$fWeek <- factor(MyData$Week)  # Convert week to factor
m1 <- aov(Readout ~ Group*fWeek + Error(Subj/fWeek), data=MyData)

My first naive question is: Is the Error(...) term correct for the design that I describe? I am not quite sure if I understand the syntax correctly. In any case, it actually seems to work fine, but I can't figure out how to do post hoc testing on this. TukeyHSD does not work for the aovlist which is returned, so I am kind of stuck. Is there a simple way to do the post hoc test based on the aovlist? I have been reading various questions on the list and I think that I have understood that I should be using lme from the nlme package, and this is where I run into some problems. As I understand it, the lme command that I would need would look something like this:

m1 <- lme(Readout ~ Group*fWeek, random=~1|Subj/fWeek, data=MyData)

Now this actually fails with an error message like:

Error in MEEM(object, conLin, control$niterEM) : Singularity in backsolve at level 0, block 1

The reason (I believe) is that I have some weeks where there are no observations for a specific group. In this case, one of the groups had a lot of drop-out and at Week 4 there are no subjects left in one of the groups, and that seems to be causing the problem. I can get it to run by excluding the last week with something like:

m1 <- lme(Readout ~ Group*fWeek, random=~1|Subj/fWeek, data=MyData[which(MyData$Week < 4), ])

My next question is: Is there another way around that?
I would still like to run the analysis for the remaining groups on the last time point, and I believe that it should somehow be included in the entire analysis. I have also managed to find a few postings with similar problems, but no real solutions, so any helpful comments would be much appreciated. My final issue is how to do the post hoc testing on the model. I understand (I think) that I should use the glht function from the multcomp package. For instance, I could do something like:

summary(glht(m1, linfct = c("GroupB:fWeek1 - GroupC:fWeek1 = 0", "GroupB:fWeek2 - GroupC:fWeek2 = 0")))

And that actually works fine. My problem is that Group A and fWeek 0 seem to have turned into the (intercept), so I can't write something like:

summary(glht(m1, linfct = c("GroupB:fWeek0 - GroupB:fWeek1 = 0")))

to check for changes between baseline and week 1 in Group B.
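The contrast machinery Mark points to can be seen in base R alone: switch from the default treatment (Dunnett-type) contrasts to sum-to-zero contrasts and watch the model matrix change coding.

```r
contr.treatment(3)      # default dummy coding: level 1 is the reference
cs <- contr.sum(3)      # sum-to-zero coding, common for ANOVA

# Set sum-to-zero contrasts globally, build a model matrix, then restore
old <- options(contrasts = c("contr.sum", "contr.poly"))
g <- factor(c("A", "B", "C"))
mm <- model.matrix(~ g)   # columns now use the sum-to-zero coding
options(old)
```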
Re: [R] What is the difference between prcomp and princomp?
Peng Yu wrote: Some webpage has described prcomp and princomp, but I am still not quite sure what the major difference between them is. The main difference, which could be extracted from the information given in the help files, is that prcomp uses the singular value decomposition [i.e. it does not rely on eigenanalysis], which is generally the preferred method for numerical accuracy. There is plenty of information on the web about the differences between R- and Q-mode PCA. Regards, Mark. Peng Yu wrote: Some webpage has described prcomp and princomp, but I am still not quite sure what the major difference between them is. Can they be used interchangeably? In help, it says 'princomp' only handles so-called R-mode PCA, that is, feature extraction of variables. If a data matrix is supplied (possibly via a formula) it is required that there are at least as many units as variables. For Q-mode PCA use 'prcomp'. What are R-mode and Q-mode? Are they just two different numerical methods to compute PCA? -- View this message in context: http://www.nabble.com/What-is-the-difference-between-prcomp-and-princomp--tp25952965p25955831.html
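The numerical point is easy to check: prcomp() works via the SVD, princomp() via an eigendecomposition of the covariance matrix. On the same data the components agree up to column sign, and the standard deviations differ only in the n vs n-1 divisor.

```r
set.seed(1)
X <- matrix(rnorm(100 * 5), 100, 5)
p1 <- prcomp(X)      # SVD-based
p2 <- princomp(X)    # eigen-based (divides by n, not n-1)

# Loadings agree up to column sign
load_diff <- max(abs(abs(unclass(p2$loadings)) - abs(p1$rotation)))

# Standard deviations differ only by the sqrt(n/(n-1)) factor
sd_ratio <- p1$sdev / p2$sdev    # all equal sqrt(100/99)
```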
Re: [R] Trendline for a subset of data
Hi Steve, However, I am finding that ... the trendline ... continues to run beyond this data segment and continues until it intersects the vertical axes at each side of the plot. Your best option is probably Prof. Fox's reg.line function in package car.

## library(car)
?reg.line
reg.line

Regards, Mark. smurray444 wrote: Dear all, I am using abline(lm ...) to insert a linear trendline through a portion of my data (e.g. dataset[,36:45]). However, I am finding that whilst the trendline is correctly displayed and representative of the data portion I've chosen, the line continues to run beyond this data segment and continues until it intersects the vertical axes at each side of the plot. How do I display the line so that it only runs between point 36 and 45 (as shown in the example above) and doesn't continue to display a line throughout the rest of the plot space? Many thanks, Steve -- View this message in context: http://www.nabble.com/Trendline-for-a-subset-of-data-tp25818425p25821972.html
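A package-free alternative, for the record: fit the model to the subset, then draw the fitted values with lines() over just that range. Unlike abline(), which always spans the whole plot region, lines() stops at its endpoints. (Made-up data for illustration.)

```r
set.seed(42)
x <- 1:50
y <- 0.5 * x + rnorm(50)
idx <- 36:45                         # the data segment of interest
fit <- lm(y ~ x, subset = idx)       # regression on the segment only

pdf(file.path(tempdir(), "segment.pdf"))   # headless-safe device
plot(x, y)
lines(x[idx], fitted(fit), lwd = 2)  # trendline confined to points 36-45
dev.off()
```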
Re: [R] trouble with html() in Hmisc
Hi Liviu,

tmp <- latex(.object, cdec=c(2,2), title="")
class(tmp)
[1] "latex"
html(tmp)
/tmp/RtmprfPwzw/file7e72f7a7.tex:9: Warning: Command not found: \tabularnewline Giving up command: \...@hevea@amper /tmp/RtmprfPwzw/file7e72f7a7.tex:11: Error while reading LaTeX: This array/tabular column has no specification

This has nothing to do with Hmisc or hevea. In the header/preamble of all LyX files you will, for instance, find the following:

## -
%% LyX 1.6.2 created this file. For more info, see http://www.lyx.org/.
%% Do not edit unless you really know what you are doing.
\documentclass[english]{article}
. . .
%% LyX specific LaTeX commands.
%% Because html converters don't know tabularnewline
\providecommand{\tabularnewline}{\\}

Regards, Mark. Liviu Andronic wrote: Dear all, On my system html() conversion of a `latex()' object fails. Follows a dummy example:

require(Hmisc)
data(Angell)
.object <- cor(Angell[,1:2], use="complete.obs")
tmp <- latex(.object, cdec=c(2,2), title="")
class(tmp)
[1] "latex"
html(tmp)
/tmp/RtmprfPwzw/file7e72f7a7.tex:9: Warning: Command not found: \tabularnewline Giving up command: \...@hevea@amper /tmp/RtmprfPwzw/file7e72f7a7.tex:11: Error while reading LaTeX: This array/tabular column has no specification

Adios. I have hevea-1.10 installed on Debian (according to the help page, it performs the conversion). Attached is the produced .tex file. Is this a bug or would there be a way to work around this behaviour?
Thank you, Liviu

sessionInfo()
R version 2.9.2 (2009-08-24) x86_64-pc-linux-gnu
locale: LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] datasets grDevices tcltk splines graphics utils stats
[8] methods base
other attached packages:
[1] fortunes_1.3-6 RcmdrPlugin.Export_0.3-0 Rcmdr_1.5-2
[4] car_1.2-15 hints_1.0.1-1 boot_1.2-39
[7] relimp_1.0-1 xtable_1.5-5 Hmisc_3.7-0
[10] survival_2.35-7
loaded via a namespace (and not attached):
[1] cluster_1.12.0 grid_2.9.2 lattice_0.17-25 tools_2.9.2
system("uname -a")
Linux debian-liv 2.6.30-1-amd64 #1 SMP Sat Aug 15 18:09:19 UTC 2009 x86_64 GNU/Linux
-- View this message in context: http://www.nabble.com/trouble-with-html%28%29-in-Hmisc-tp25721612p25728656.html
Re: [R] trouble with html() in Hmisc
Liviu, Does this mean that Hmisc produces LyX code (as opposed to LaTeX code)? No. Sorry for confusing you: it means that html does not know what tabularnewline means; it cannot interpret it. My reply showed where the problem lies and how to fix it. You need to add the following command to the preamble of the *.tex file:

\providecommand{\tabularnewline}{\\}

Regards, Mark. Liviu Andronic wrote: Hello, On 10/3/09, Mark Difford mark_diff...@yahoo.co.uk wrote: This has nothing to do with Hmisc or hevea. Although I have LyX installed, I don't quite understand where LyX comes into play. The R code in the original e-mail takes a table-like object and transforms it into LaTeX; then html() calls hevea to convert .tex to .html and opens a browser to display the result. %% LyX specific LaTeX commands. %% Because html converters don't know tabularnewline \providecommand{\tabularnewline}{\\} If I print the `latex' object from the previous e-mail, I get the offending tabularnewline.

data(Angell)
.object <- cor(Angell[,1:2], use="complete.obs")
tmp <- latex(.object, cdec=c(2,2), file="", title="")
% latex.default(.object, cdec = c(2, 2), file = "", title = "")
%
\begin{table}[!tbp]
\begin{center}
\begin{tabular}{lrr}\hline\hline
\multicolumn{1}{l}{} & \multicolumn{1}{c}{moral} & \multicolumn{1}{c}{hetero} \tabularnewline
\hline
moral  & $ 1.00$ & $-0.59$ \tabularnewline
hetero & $-0.59$ & $ 1.00$ \tabularnewline
\hline
\end{tabular}
\end{center}
\end{table}

Does this mean that Hmisc produces LyX code (as opposed to LaTeX code)? Liviu -- View this message in context: http://www.nabble.com/trouble-with-html%28%29-in-Hmisc-tp25721612p25731240.html
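Putting Mark's fix in context, here is a minimal standalone .tex file (a sketch, not the actual Hmisc output) showing where the \providecommand line goes so that converters like hevea can process the table:

```latex
\documentclass{article}
% html converters (hevea) don't know \tabularnewline; define it up front
\providecommand{\tabularnewline}{\\}
\begin{document}
\begin{tabular}{lrr}\hline\hline
       & moral   & hetero  \tabularnewline \hline
moral  & $1.00$  & $-0.59$ \tabularnewline
hetero & $-0.59$ & $1.00$  \tabularnewline \hline
\end{tabular}
\end{document}
```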
Re: [R] PCA or CA
Hi Paul, I have a data set for which PCA based between group analysis (BGA) gives significant results but CA-BGA does not. I am having difficulty finding a reliable method for deciding which ordination technique is most appropriate. Reliability really comes down to you thinking about and properly defining what _information_ you want to extract from your data set, which we know nothing about. PCA and CA are fundamentally different. The classical use of CA lies in the analysis of count data (contingency tables), for which it remains a brilliant method. It is also widely applied to analyzing normal n x p matrices of the type usually analyzed by PCA. A doubly-centered PCA would get you close to a CA of the normal n x p matrix (i.e. to an analysis of the same matrix). This is a biggish area, so grab your specs, and perhaps start with Jolliffe (PCA) and Benzecri/Greenacre (CA). Regards, Mark. Paul Dennis-3 wrote: Dear all, I have a data set for which PCA based between group analysis (BGA) gives significant results but CA-BGA does not. I am having difficulty finding a reliable method for deciding which ordination technique is most appropriate. I have been told to do a 1-table CA and if the 1st axis is >2 units go for CA, if not then PCA. Another approach is that described in the Canoco manual - perform DCA and then look at the length of the axes. I used decorana in vegan and it gives axis lengths. I assume that these are measured in SD units. Anyway, the manual says if the axis length is <3 go for PCA, >4 use CA, and if intermediate use either. Are either of these approaches good/valid/recommended or is there a better method? Thanks, Paul
-- View this message in context: http://www.nabble.com/PCA-or-CA-tp25668667p25670451.html
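Mark's double-centring remark can be made concrete in a few lines of base R: remove row means and then column means from a count table, and run a PCA on the result. (A sketch on made-up count data; this illustrates the double centring, not a full CA.)

```r
set.seed(7)
N <- matrix(rpois(30, 5), 6, 5)       # made-up count table
Nc <- sweep(N,  1, rowMeans(N))       # centre the rows
Nc <- sweep(Nc, 2, colMeans(Nc))      # then the columns; row means stay 0
pc <- prcomp(Nc, center = FALSE)      # PCA of the doubly-centred table
```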
Re: [R] How to do a PCA with ordered variables?
P. Branco wrote: I have used the dudi.mix method from the ade4 package, but when I do the $index it shows me that R has considered my variables as quantitative. What should I do? You should make sure that they are encoded as ordered factors, which has nothing to do with ade4's dudi.mix(). ## ?ordered Mark. P. Branco wrote: Hi, I want to do a PCA using a set of variables which are ordered. I have used the dudi.mix method from the ade4 package, but when I do the $index it shows me that R has considered my variables as quantitative. What should I do? -- View this message in context: http://www.nabble.com/How-to-do-a-PCA-with-ordered-variables--tp25491950p25491969.html
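The encoding Mark means is base R's ordered factor type, which is what methods that distinguish variable types inspect:

```r
# Plain factor vs ordered factor for the same values
f  <- factor(c("low", "high", "mid"), levels = c("low", "mid", "high"))
of <- ordered(f)     # or factor(..., ordered = TRUE)
is.ordered(f)        # FALSE: treated as nominal
is.ordered(of)       # TRUE: treated as ordinal
```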
Re: [R] hel with factorial analysis with mixed data
andreiabb wrote: the message that I am getting is Error in AFDM(all_data_sub.AFDM, type=c(rep("s",1), rep("n",1), rep("n", : unused arguments ("s") (type=c("s", "n","n")) Can someone help me? If you are in hel[l] then it is entirely your own fault. The error message is clear and would have become even clearer had you bothered to read/check the help file for AFDM for yourself. There is no argument type= to AFDM. ## ?AFDM I hope this will lighten your way. Mark. andreiabb wrote: Dear forum, I am trying to use the package FactoMineR; since I have 2 class variables and 1 continuous variable I am trying to use AFDM, factorial analysis of mixed data. I have 3 variables, 2 qualitative and one quantitative. The code I am using is

all_data_sub.AFM <- all_data_sub[, c("V4", "comb", "V2")]
res <- AFDM(all_data_AFDM, type=c(rep("s",1), rep("n",1), rep("n",1)), ncp=5, sup.var=3:3, graph=FALSE)

the message that I am getting is Error in AFDM(all_data_sub.AFDM, type=c(rep("s",1), rep("n",1), rep("n", : unused arguments ("s") (type=c("s", "n","n")) Can someone help me? thanks -- View this message in context: http://www.nabble.com/hel-with-factorial-analysis-with-mixed-data-tp25453138p25453922.html
Re: [R] could not find function Varcov after upgrade of R?
Hi Zhu, could not find function Varcov after upgrade of R? Frank Harrell (author of Design) has noted in another thread that Hmisc has changed... The problem is that functions like anova.Design call a function in the _old_ Hmisc package called Varcov.default. In the new version of Hmisc this is called something else (vcov.default). The best way to fix this is to install the new (i.e. current) version of Hmisc and Frank's replacement for his Design package, which is called rms (Regression Modeling Strategies). Regards, Mark. zhu yao wrote: I use the Design library. Take this example:

library(Design)
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens, 1, 0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
Srv <- Surv(dt,e)
f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
cox.zph(f, "rank")  # tests of PH
anova(f)  # Error in anova.Design(f) : could not find function "Varcov"

Yao Zhu Department of Urology Fudan University Shanghai Cancer Center No. 270 Dongan Road, Shanghai, China 2009/9/12 Ronggui Huang ronggui.hu...@gmail.com: I cannot reproduce the problem you mentioned.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
anova(lm.D9 <- lm(weight ~ group))

sessionInfo()
R version 2.9.2 (2009-08-24) i386-pc-mingw32
locale: LC_COLLATE=Chinese (Simplified)_People's Republic of China.936;LC_CTYPE=Chinese (Simplified)_People's Republic of China.936;LC_MONETARY=Chinese (Simplified)_People's Republic of China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_People's Republic of China.936
attached base packages:
[1] stats graphics grDevices utils datasets methods base

2009/9/12 zhu yao mailzhu...@gmail.com: After upgrading R to 2.9.2, I can't use the anova() function. It says could not find function "Varcov". What's wrong with my computer? Help needed, thanks! Yao Zhu Department of Urology Fudan University Shanghai Cancer Center No. 270 Dongan Road, Shanghai, China -- HUANG Ronggui, Wincent Doctoral Candidate Dept of Public and Social Administration City University of Hong Kong Home page: http://asrr.r-forge.r-project.org/rghuang.html -- View this message in context: http://www.nabble.com/could-not-find-function-%22Varcov%22-after-upgrade-of-R--tp25412881p25414257.html
Re: [R] Triangular distribution for kernel regression
Hi Bryan, I am trying to get fitted/estimated values using kernel regression and a triangular kernel. Look at Loader's locfit package. You are likely to be pleasantly surprised. Regards, Mark. Bryan-65 wrote: Hello, I am trying to get fitted/estimated values using kernel regression and a triangular kernel. I have found packages that easily fit values from a kernel regression (e.g. ksmooth) but do not have a triangular distribution option, and density estimators that have triangular distribution options that I can't seem to use to produce estimated values (e.g. density). Any help is appreciated. Bryan -- View this message in context: http://www.nabble.com/Triangular-distribution-for-kernel-regression-tp25416706p25417878.html
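If a package is not wanted, a Nadaraya-Watson estimator with a triangular kernel is only a few lines of base R. A sketch: ksmooth_tri is a made-up name, and the data are simulated.

```r
# Nadaraya-Watson kernel regression with a triangular kernel
ksmooth_tri <- function(x, y, xout = x, bw = 1) {
  sapply(xout, function(x0) {
    w <- pmax(1 - abs(x - x0) / bw, 0)        # triangular weights
    if (sum(w) == 0) NA_real_ else sum(w * y) / sum(w)
  })
}

set.seed(3)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.2)
fit <- ksmooth_tri(x, y, bw = 0.8)    # fitted values at the observed x
```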
Re: [R] Alternative to Scale Function?
The scale function will return the mean and sd of the data. By default. Read ?scale. Mark. Noah Silverman-3 wrote: I think I just answered my own question. The scale function will return the mean and sd of the data. So the process is fairly simple: scale the training data variable, note the mean and sd from the scale, then manually scale the test data using the mean and sd from the training data. That should make sure that a value is transformed the same regardless of which data set it is in. Do I have this correct, or can anybody contribute any more to the concept? Thanks! -- Noah On 9/11/09 1:10 PM, Noah Silverman wrote: Hi, Is there an alternative to the scale function where I can specify my own mean and standard deviation? I've come across an interesting issue where this would help. I'm training and testing on completely different sets of data. The testing set is smaller than the training set. Using the standard scale function of R seems to introduce some error. Since it scales data WITHIN the set, it may scale the same number to a different value since the range in the training and testing set may be different. My thought was to scale the larger training set of data, then use the mean and SD of the training data to scale the testing data according to the same parameters. That way a number will transform to the same result regardless of whether it is in the training or testing set. I can't be the first one to have looked at this. Does anyone know of a function in R, or if there is a scale alternative where I can control the parameters? Thanks! -- Noah
-- View this message in context: http://www.nabble.com/Alternative-to-Scale-Function--tp25407625p25408289.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Omnibus test for main effects in the face of an interaction containing the main effects.
Hi John, When Group is entered as a factor, and the factor has two levels, the ANOVA table gives a p value for each level of the factor. This does not (normally) happen, so you are doing something strange. ## From your first posting on this subject fita <- lme(Post ~ Time + factor(Group) + factor(Group)*Time, random = ~1|SS, data = blah$alldata) To begin with, what is blah$alldata? lme() asks for a data frame if the formula interface is used. This looks like part of a list, or the column of a data frame. Have a look at the output below, from a syntactically correct model that has essentially the same structure as yours. The data set comes with nlme, so you can run it directly to see the result. Sex, with two levels, is not split in the anova table. ## str(Orthodont) anova( lme(distance ~ age * Sex, data = Orthodont, random = ~ 1) ) Surely this is essentially what you are looking for? If Sex is not already a factor (and it really is better to make it one when you build your data set), then you can use as.factor as you have done, with the same result. (Note: age * Sex expands to age + Sex + age:Sex, which equals the messy and unnecessary age + Sex + age * Sex.) Regards, Mark. John Sorkin wrote: Daniel, When Group is entered as a factor, and the factor has two levels, the ANOVA table gives a p value for each level of the factor. What I am looking for is the omnibus p value for the factor, i.e. the test that the factor (with all its levels) improves the prediction of the outcome. You are correct that normally one could rely on the fact that the model Post ~ Time + as.factor(Group) + as.factor(Group)*Time contains the model Post ~ Time + as.factor(Group) and compare the two models using anova(model1, model2). However, because my model has a random effect, the comparison is not so easy. The REML comparison of nested random-effects models is not valid when the fixed effects are not the same in the two models, which is the essence of the problem in my case.
In addition to the REML problem, if one wants to perform an omnibus test for Group, one would want to compare nested models, one containing Group and the other not containing Group. This would suggest comparing Post ~ Time + as.factor(Group)*Time to Post ~ Time + Group + as.factor(Group)*Time. The quandary here is whether or not one should allow the first model, as it is poorly specified: one term of the interaction as.factor(Group)*Time, namely as.factor(Group), does not appear as a main effect - a no-no in model building. John John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) jsor...@grecc.umaryland.edu Daniel Malter dan...@umd.edu 09/07/09 9:23 PM John, your question is confusing. After reading it twice, I still cannot figure out what exactly you want to compare. Your model a is the unrestricted model, and model b is a restricted version of model a (i.e., b is a hierarchically reduced version of a; put differently, all coefficients of b are in a, with a having additional coefficients). Thus, it is appropriate to compare the models (also called nested models). Comparing c with a and d with a is also appropriate for the same reason. However, note that, depending on discipline, it may be highly unconventional to fit an interaction without all direct effects of the interacted variables (the reason for this being that you may get biased estimates). What you might consider is: 1. Run an intercept-only model 2. Run a model with group and time 3. Run a model with group, time, and the interaction Then compare 2 to 1, and 3 to 2.
This tells you whether including more variables (hierarchically) makes your model better. On a different note: if lme fits with restricted maximum likelihood, I think I remember that you cannot compare them; you have to fit them with maximum likelihood. I am pointing this out because lmer fits with restricted maximum likelihood by default, so lme might too. HTH, Daniel - cuncta stricte discussurus - -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of John Sorkin Sent: Monday, September 07, 2009 4:00 PM To: r-help@r-project.org Subject: [R] Omnibus test for main effects in the face of an interaction containing the main effects. R 2.9.1 Windows XP UPDATE: Even my first suggestion, anova(fita, fitb), is probably not appropriate, as the fixed effects are different in the two models, so
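Daniel's last point can be put into practice by refitting with method = "ML" before comparing nested fixed-effects specifications; a sketch using the Orthodont data shipped with nlme (the model names are illustrative):

```r
library(nlme)

## nested fixed-effects models must be fitted with ML, not REML (the lme
## default), before their likelihoods can legitimately be compared
fm.small <- lme(distance ~ age,       data = Orthodont, random = ~ 1, method = "ML")
fm.full  <- lme(distance ~ age * Sex, data = Orthodont, random = ~ 1, method = "ML")

anova(fm.small, fm.full)   # likelihood-ratio test for the added fixed effects
```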
Re: [R] Problem with xtabs(), exclude=NULL, and counting NA's
I must say that this is slightly odd behavior to require both na.action= AND exclude=. Does anyone know of a justification? Not strange at all: see ?options, under the sub-heading "Options set in package stats", for na.action. You need to override the default setting. ws-7 wrote:

xtabs(~wkhp, x, exclude=NULL, na.action=na.pass)
wkhp
  20   30   40   45   60   NA
   1    1   10    1    3    4

Thanks! I must say that this is slightly odd behavior to require both na.action= AND exclude=. Does anyone know of a justification? Shouldn't it be changed? <vent> Ah well, if R were internally consistent or corresponded to typical Unix programming practices, it just wouldn't feel the same ... </vent> Cheers! -- David Winsemius, MD Heritage Laboratories West Hartford, CT -- View this message in context: http://www.nabble.com/Problem-with-xtabs%28%29%2C-exclude%3DNULL%2C-and-counting-NA%27s-tp25304142p25305878.html Sent from the R help mailing list archive at Nabble.com.
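A self-contained version of the poster's call (the toy data are constructed to reproduce the counts shown above):

```r
## toy data standing in for the poster's x
x <- data.frame(wkhp = c(20, 30, rep(40, 10), 45, rep(60, 3), rep(NA, 4)))

## both settings are needed: na.action = na.pass keeps the NA rows,
## and exclude = NULL lets NA appear as a level in the table
xtabs(~ wkhp, data = x, exclude = NULL, na.action = na.pass)
```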
Re: [R] How can I appoint a small part of the whole data
Yichih, Answer 2 is correct, because your indexing specification in 1 is wrong: mu1990$edu==2|3|4 is evaluated as (mu1990$edu==2) | 3 | 4, and since the nonzero constants 3 and 4 count as TRUE, the condition selects every element. Write each comparison out in full, or use %in%: ## either mu1990$wage[mu1990$edu==2 | mu1990$edu==3 | mu1990$edu==4] ## or, more compactly mu1990$wage[mu1990$edu %in% 2:4] You really could have worked this out for yourself by looking at the results of your subsetting/indexing operation. Mark. Yichih Hsieh wrote: Dear all, I got another problem: if education has five levels edu=1 edu=2 edu=3 edu=4 edu=5 and I want to appoint y=edu 2~4 in 1990, which program is correct? I tried these two programs; they both work, but the two results are different. 1. fig2b <- reldist(y=mu1990$wage[mu1990$edu==2|3|4],..) 2. fig2b <- reldist(y=mu1990$wage[mu1990$edu%in%2:4],..) Which one is correct? And why do they have different results? All help highly appreciated. best, Yichih 2009/9/5 Yichih Hsieh yichih.hs...@gmail.com Dear Petr, your suggestion is useful, many thanks for your help! best, Yichih 2009/9/3 Petr PIKAL petr.pi...@precheza.cz Hi, use any of the suitable selection ways that are in R. E.g. data[data$gender==1, ] selects only female values; data$wage[(data$gender==1) & (data$race==1)] selects black female wages. And see also ?subset. Regards Petr r-help-boun...@r-project.org napsal dne 03.09.2009 10:51:59: Dear all, I have 1980~1990 eleven data sets; every year has three variables: wage, gender (1=female, 2=male), race (1=black, 2=white). My original command is: fig2b <- reldist(y=mu1990$wage, yo=mu1980$wage,...) I have three questions: 1. If I want to appoint y=women's wage in 1990, yo=women's wage in 1980 2. If I want to appoint y=women's wage in 1990, yo=men's wage in 1990 3. If I want to appoint y=black women's wage in 1990, yo=white women's wage in 1990 How can I modify the commands? All help highly appreciated.
Best, Yichih -- Yichih Hsieh e-mail : yichih.hs...@gmail.com -- View this message in context: http://www.nabble.com/How-can-I-appoint-a-small-part-of-the-whole-data-tp25272209p25308714.html Sent from the R help mailing list archive at Nabble.com.
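The difference between the two indexing styles can be seen on a toy data set (the mu1990 here is invented; only the wage/edu structure is taken from the thread):

```r
mu1990 <- data.frame(wage = c(10, 12, 15, 18, 22),
                     edu  = c(1, 2, 3, 4, 5))

mu1990$wage[mu1990$edu %in% 2:4]       # correct: wages at edu levels 2-4
mu1990$wage[mu1990$edu == 2 | 3 | 4]   # wrong: 3 and 4 are always TRUE,
                                       # so every wage is returned
```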
Re: [R] test for bimodality
Hi John, Has a test for bimodality been implemented in R? You may find the code at the URL below useful. It was written by Jeremy Tantrum (a PhD student of Werner Stuetzle's). Amongst other things there is a function to plot the unimodal and bimodal Gaussian smoothers closest to the observed data. A dip-test statistic is also calculated. Regards, Mark. http://www.stat.washington.edu/wxs/Stat593-s03/Code/jeremy-unimodality.R John Sansom wrote: Has a test for bimodality been implemented in R? Thanks, John NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd. -- View this message in context: http://www.nabble.com/test-for-bimodality-In-Reply-To%3D-tp25216164p25220627.html Sent from the R help mailing list archive at Nabble.com.
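As a packaged alternative (my suggestion, not mentioned in the thread), Hartigans' dip test is available in the diptest package:

```r
## assumes the diptest package is installed: install.packages("diptest")
library(diptest)

set.seed(42)
x <- c(rnorm(150, mean = 0), rnorm(150, mean = 4))  # a clearly bimodal sample

dip.test(x)   # a small p-value is evidence against unimodality
```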
Re: [R] adding factor scores back to an incomplete dataset...
Hi David, Phil, Phil Spector wrote: David - Here's the easiest way I've been able to come up with. Easiest? You are making unnecessary work for yourselves and seem not to understand the purpose of ?naresid (i.e. na.action = na.exclude). Why not take the simple route that I gave, which really is R's + factanal's route? Using Phil's data as an example: ## dat = data.frame(matrix(rnorm(100),20,5)) dat[3,4] = NA dat[12,3] = NA scrs <- factanal(~X1+X2+X3+X4+X5, data=dat, factors=2, scores='regression', na.action=na.exclude)$scores TrueDat <- merge(dat, scrs, by=0, all.x=TRUE, sort=FALSE) TrueDat Regards, Mark. David G. Tully wrote: Thanks, Prof. Spector. Your first solution works well for me. Phil Spector wrote: David - Here's the easiest way I've been able to come up with. I'll provide some sample data to make things clearer (hint, hint): dat = data.frame(matrix(rnorm(100),20,5)) dat[3,4] = NA dat[12,3] = NA scrs = factanal(na.omit(dat), factors=2, scores='regression')$scores rownames(scrs) = rownames(na.omit(dat)) newdat = merge(dat, scrs, by=0, all.x=TRUE, sort=FALSE) This will result in the observations with missing values being at the end of the data frame. If you want the original order (assuming default row names), you could use newdat[order(as.numeric(newdat$Row.names)),] A somewhat more complicated approach is, in some sense, more direct: dat$Factor1 = NA dat$Factor2 = NA dat[rownames(na.omit(dat[,-c(6,7)])), c('Factor1','Factor2')] = factanal(na.omit(dat[,-c(6,7)]), factors=2, scores='regression')$scores The order of the data is preserved. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Tue, 25 Aug 2009, David G. Tully wrote: I am sure there is a simple way to do the following, but I haven't been able to find it. I am hoping a merciful soul on R-help could point me in the right direction. I am doing a factor analysis on survey data with missing values.
To do this, I run: FA1 <- factanal(na.omit(DATA), factors = X, rotation = 'oblimin', scores = 'regression') Now that I have my factors and factor scores, I want to add those scores back to my original dataset so I can plot factor scores by demographics. However, when I try to add the scores back to the original data frame, the variables are of different lengths. Is there a way to subset from my original data set that will work with factanal() and preserve the original rows, or that will allow me to append the factor scores back onto the original dataset with the proper rows and NAs where there could be no data? Again, I apologize if I am missing something basic. I am a self-taught R user and couldn't find an answer to this question. Thanks in advance, David -- View this message in context: http://www.nabble.com/adding-factor-scores-back-to-an-incomplete-dataset...-tp25140959p25204698.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Between-group variance from ANOVA
Hi Emma, ...from this I can read the within-group variance. Can anyone tell me how I may find out the between-group variance? But it's in the table, above the within-group variance. Remember that F is the ratio of these two quantities, i.e. the between-group mean square divided by the within-group mean square. I will work with my example, since you never set the seed, so your answers differ from mine (which really does not help matters).

set.seed(7)
TDat <- data.frame(response = c(rnorm(100, 5, 2), rnorm(100, 20, 2)))
TDat$group <- gl(2, 100, labels=c("A","B"))
summary(aov(response ~ group, data=TDat))
11225.25/3.64
[1] 3083.86

There is some rounding error on the mean squares (i.e. mean variances), but F is correct. Using estimates calculated by a different route we have:

11225.249057/3.639801
[1] 3084.028

Does this answer your question? Regards, Mark. emj83 wrote: I have done this in R and this is the ANOVA table I get:

summary(aov(response ~ group, data=TDat))
            Df  Sum Sq Mean Sq F value    Pr(>F)
group        1 11203.5 11203.5  2505.0  <2.2e-16 ***
Residuals  198   885.5     4.5

The model is response(i,j) = group(i) + error(i,j); we assume that group ~ N(0, P^2) and error ~ N(0, sigma^2). I know that sigma^2 is equal to 4.5; how do I find out P^2? In the problem that I am trying to apply this to, I have more than 2 groups. I was hoping there would be a function that helps you do this that I don't know about. Thanks for your help Emma Mark Difford wrote: Hi Emma, R gives you the tools to work this out. ## Example set.seed(7) TDat <- data.frame(response = c(rnorm(100, 5, 2), rnorm(100, 20, 2))) TDat$group <- gl(2, 100, labels=c("A","B")) with(TDat, boxplot(split(response, group))) summary(aov(response ~ group, data=TDat)) Regards, Mark. emj83 wrote: can anyone advise me please? emj83 wrote: I have done some ANOVA tables for some data that I have; from this I can read the within-group variance. Can anyone tell me how I may find out the between-group variance?
Thanks Emma -- View this message in context: http://www.nabble.com/Between-group-variance-from-ANOVA-tp24954045p25129942.html Sent from the R help mailing list archive at Nabble.com.
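Emma's follow-up (how to get P^2 itself when group is treated as random) is not answered directly above; for balanced data the classical method-of-moments estimate is (MS_group - MS_residual) divided by the per-group sample size. A sketch using Mark's simulated data:

```r
set.seed(7)
TDat <- data.frame(response = c(rnorm(100, 5, 2), rnorm(100, 20, 2)))
TDat$group <- gl(2, 100, labels = c("A", "B"))

## pull the mean squares out of the aov summary table
ms <- summary(aov(response ~ group, data = TDat))[[1]][["Mean Sq"]]
n.per.group <- 100

P2     <- (ms[1] - ms[2]) / n.per.group  # between-group variance component
sigma2 <- ms[2]                          # within-group variance
```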