Re: [R] Advice on use of R for Generalised Linear Modelling
Alan, You might want to have a look at the R2STATS package on CRAN. It is a GUI for GLM and GLMM written in GTK (using the nice RGtk2 and gWidgets packages by Michael Lawrence and John Verzani). Don't expect any gain in performance on large datasets though. But at least the use of GLM is quite intuitive. HTH, Yvonnick Noel Psychology and Statistics University of Brittany, Rennes France __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Superpose two QQ-plots (gamma distribution) with lattice function qqmath()
David, Duncan,

> Hi
> Following on David's rate argument try (with modifications of pch and grid)
>
> rate <- 1/4
> shape = 8
> rate = c(rep(1/4,100),rep(1/3,100))

I don't think the problem is related to the rate argument, which can well be vectorized, as is the case for a number of arguments in distribution functions in R (note that you are redefining it as a vector above).

> x = rgamma(200,shape,rate)
> groups = gl(2,100,200,labels=LETTERS[1:2])
> dat = data.frame(x=x, gp=groups)
> qqmath(~ x, data = dat, groups = gp, pch = 20, type = c("p","g"),
>        distribution = function(x) qgamma(x,shape,rate),
>        panel = function(x,groups,...) {
>          panel.qqmath(x,groups, ...)
>        })

This works! Thank you very much (I had spent some time on that). I am now trying to add a QQ-line but face a new problem. Only one line appears, away from the points:

  qqmath(~ x, data = dat, groups = gp, pch = 20, type = c("p","g"),
         distribution = function(x) qgamma(x,shape,rate),
         panel = function(x,groups,...) {
           panel.qqmath(x,groups, ...)
           panel.qqmathline(x,groups,...)
         })

Any hint? Thanks a lot for your help, Yvonnick
[R] Superpose two QQ-plots (gamma distribution) with lattice function qqmath()
Hello, I am trying to superpose two QQ plots on a single panel with the lattice qqmath function. Here is a reproducible example of the problem I am facing:

  library(lattice)

  # Generate data
  shape = 8
  rate = c(rep(1/4,100),rep(1/3,100))
  x = rgamma(200,shape,rate)
  groups = gl(2,100,200,labels=LETTERS[1:2])

  # Plot
  qqmath(~x, groups=groups, panel = panel.superpose,
         distribution = function(x) qgamma(x,shape,rate),
         panel.groups = function(x,subscripts,...) {
           panel.qqmath(x,shape=shape,rate=rate[subscripts],...)
           panel.qqmathline(x,shape=shape,rate=rate[subscripts],...)
         })

Both data series seem to be reproduced twice, somewhat rescaled. I don't understand what this means. What am I doing wrong? Thanks a lot for your help, Yvonnick Noel University of Brittany, Rennes Dpt. of Psychology France
Re: [R] a confidence interval from an ANOVA model
Hello Erin, You might want to compute *credibility* intervals instead. They have the true meaning of intervals within which the *unknown parameter* (not the sample statistic) has a fixed probability of lying (confidence intervals do not). The AtelieR package, especially in its "Bayesian inference on several means" section, lets you input the data summary (means, standard deviations and counts) and automatically provides these intervals on the unknown means. The approach implemented is the one described in: Neath, A., & Cavanaugh, J. (2006). A Bayesian approach to the multiple comparisons problem. Journal of Data Science, 4, 131-146. In addition, it will give you the best constrained model (in terms of equal means) for your data, and the probability that this model is the true one. It is an elegant solution to the multiple comparisons problem. HTH, Yvonnick Noel University of Brittany, Rennes France
[R] New Book: Statistical Psychology with R [in French]
Dear useRs, French-reading people among you might be interested in the following book: Noel, Y. (2013). Psychologie statistique avec R [Statistical psychology with R, in French], coll. PratiqueR, Paris: Springer. http://www.springer.com/psychology/book/978-2-8178-0424-8

This book provides a detailed presentation of all the basics of statistical inference for psychologists, in both a Fisherian and a Bayesian approach. Although many authors have recently advocated the use of Bayesian statistics in psychology (Wagenmakers et al., 2010, 2011; Kruschke, 2010; Rouder et al., 2009), statistical manuals for psychologists barely mention them. This manual provides a full Bayesian toolbox for commonly encountered problems in psychology and the social sciences, for comparing proportions, variances and means, and discusses its advantages. But all foundations of the frequentist approach are also provided, from data description to probability and density, through combinatorics and set algebra.

A special emphasis has been put on the analysis of categorical data and contingency tables. Binomial and multinomial models with beta and Dirichlet priors are presented, and their use for making contrasts (between rows or between cells) in contingency tables is detailed on real data. An automatic search for the best model for all problem types is implemented in the AtelieR package, available on CRAN.

Bayesian ANOVA is also presented, and illustrated on real data with the help of the AtelieR and R2STATS packages (a GUI for GLM and GLMM in R). In addition to classical and Bayesian inference on means, direct and Bayesian inference on effect size and standardized effects are presented.

I hope you might find this book useful, Best regards, Yvonnick Noel University of Brittany, Rennes France
Re: [R] Idea/package to linearize a curve along the diagonal?
> I am trying to normalize some data. First I fitted a principal curve (using the LCPM package), but now I would like to apply a transformation so that the curve becomes a straight diagonal line on the plot. The data used to fit the curve would then be normalized by applying the same transformation to it.

It is unclear to me what you mean by diagonal but I suspect what you're looking for is to locate projected points onto the unfolded curve. That is exactly what coordinates on the principal curve would give you. Sorry if I misunderstood your point, Yvonnick Noel University of Brittany, Rennes, France
Re: [R] GTK
> I am struggling to install GTK+ for Windows 7. RGtk2 needs this package to load. Does anybody know of an installation file that works?

GTK+ is automatically installed when you install the RGtk2 package (you'll be asked about it during installation). As of R-2.14.1, it is installed under the R tree, so if you had write access when installing R itself, you should have no problem. HTH, Yvonnick Noel University of Brittany, Rennes 2 France
Re: [R] Bayesian data analysis recommendations
Hi, On the R side, you may want to have a look at the AtelieR package. It's a GTK GUI which gives you a simple interface to some common Bayesian tests (on a proportion, on a variance, on a mean, on mean and variance jointly, on several proportions, on contingency tables, on several means). There are also some automatic search procedures for the best model, when comparing several means, proportions, or rows in a contingency table. Hope this may be useful, Yvonnick Noel University of Brittany Department of Psychology Rennes, France

> Dear all, I am trying to learn Bayesian inference and Bayesian data analysis, I am new in the field. Would any experts on the list recommend any good sites or materials for beginners? My approach is to learn and understand the theory first, then program on my own using R, though I see there are already packages. appreciate any help, thanks in advance!
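For readers who prefer to program such tests themselves, the simplest case in the list above (inference on a proportion) can be sketched in a few lines of base R. This is only an illustration and does not use AtelieR: the uniform Beta(1, 1) prior and the counts below are assumptions made up for the example.

```r
# Hypothetical data: 14 successes out of 40 trials
successes <- 14
trials    <- 40

# Assumed prior: Beta(1, 1), i.e. uniform on [0, 1]
prior_a <- 1
prior_b <- 1

# Conjugate update: the posterior is Beta(a + successes, b + failures)
post_a <- prior_a + successes
post_b <- prior_b + (trials - successes)

# 95% equal-tailed credible interval for the unknown proportion
cred_int <- qbeta(c(0.025, 0.975), post_a, post_b)

# Posterior probability that the proportion exceeds 0.5
p_above_half <- pbeta(0.5, post_a, post_b, lower.tail = FALSE)

print(cred_int)
print(p_above_half)
```

Changing `prior_a` and `prior_b` shows how much the conclusion depends on the prior, which is a useful exercise when learning the theory first.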
Re: [R] An R interface to Model Building
Brett, Spencer, I replied to Brett on the R-SIG-GUI mailing list, suggesting the proto package. I found it most useful for structuring the code when developing my R2STATS interface.

> 2. Have you reviewed the other R projects with a graphical user interface for R? Several are listed at http://sciviews.org/_rgui/.

Is this page still maintained? I wrote to Philippe Grosjean months ago but received no reply.

> 3. If you would like to collaborate on a project with others, r-forge.r-project.org is a standard place for hosting collaborative projects relating to R. I looked for a few of the projects listed at http://sciviews.org/_rgui/ and couldn't find any on R-Forge.

The R2STATS and the AtelieR packages are on R-Forge and provide GUIs for fitting and comparing various models (GLM and GLMM), in both a frequentist and a Bayesian approach. Best, Yvonnick Noel University of Brittany at Rennes Department of Psychology France
Re: [R] SPSS - R
Dear Kristi,

> Also, can anyone recommend any resources to help SPSS users learn to do things in R?

You may want to have a look at the R2STATS package, a simple GUI for linear models. Best, Yvonnick Noel University of Brittany Department of Psychology Rennes, France
Re: [R] SPSS F-test on change in R square between hierarchical models
Christopher,

> I am wondering if anyone knows how to perform an F-test on the change in R square between hierarchical models in R? SPSS provides this information and a researcher that I am working with is interested in getting this information. Alternatively, if someone knows how I can calculate the test statistic (SPSS calls it F-change?) and dfs that would be helpful as well.

What you describe is just the standard F test for comparing two models, or testing deviance reduction between two *nested* models (I suspect this is what you mean by hierarchical). The anova() function will do that. The R2STATS GUI will also give you these tests, along with the R-squared, in the same table. A common misconception about an F-test is that it is the test of a variable's effect, when strictly speaking it is a test of the deviance reduction between two models that do or do not include that particular variable (and there may be several ways to do that, each leading to possibly different F-values). Yvonnick Noel University of Brittany Department of Psychology Rennes, France
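The nested-model comparison described above can be sketched as follows. The mtcars data and the choice of predictors are assumptions for illustration only; the second half recomputes the same F statistic from the change in R-squared, which is what SPSS reports as "F change".

```r
# Two nested linear models on a built-in dataset (mtcars, for illustration only)
m_small <- lm(mpg ~ wt, data = mtcars)
m_large <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# F test on the reduction in residual sum of squares between the two models
cmp <- anova(m_small, m_large)
print(cmp)

# The same statistic computed from the change in R-squared
r2_small <- summary(m_small)$r.squared
r2_large <- summary(m_large)$r.squared
df_num   <- df.residual(m_small) - df.residual(m_large)  # added parameters
df_den   <- df.residual(m_large)                         # residual df, larger model
f_change <- ((r2_large - r2_small) / df_num) / ((1 - r2_large) / df_den)
p_change <- pf(f_change, df_num, df_den, lower.tail = FALSE)

print(c(R2_change = r2_large - r2_small, F_change = f_change, p = p_change))
```

Both routes give the same F and p-value because, for models fitted to the same response, the change in R-squared is just the change in residual sum of squares divided by the total sum of squares.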
[R] New package announcement: R2STATS, a GUI for fitting GLM and GLMM
Dear R-users, I wanted to inform you that a new package called R2STATS is available, as a graphical front-end for the glm() and glmer() functions. The GUI is based on the RGtk2 and gWidgets packages by Michael Lawrence and John Verzani, and so requires that the GTK+ library be installed first on your system. This is done automagically when installing the RGtk2 package (or the script mentioned below). It also uses the RGtk2Extras package by Tom Taverner, to provide editable grids for data frames.

This GUI is intended to provide an easy way to fit and compare GLM and GLMM models. The GLMM part is based on Douglas Bates' lme4 package and the glmer() function. Automatic plots are also drawn for every model, and you can switch from one plot to the other by just clicking on the model name. I found this feature quite useful when teaching: It helps students to get an immediate understanding of differences between models.

Note that this GUI is left (deliberately) simple and is not intended to provide a full-featured GUI (please consider using Rcmdr instead for a far more advanced GUI). But it tries to do well the one and only thing it was designed to do: Fitting and comparing models. Note that most standard statistical tests may well be presented as a simple comparison between GLMs and this is the way I go with my students here. This allows an integrated presentation for almost all common (and simple) situations in social sciences.

More information is available on my webpage: http://yvonnick.noel.free.fr/r2stats [in French for the moment, although the package is in English].

Installing the package is done from a temporary repository:

  install.packages("R2STATS", repos="http://yvonnick.noel.free.fr/cran", dep=TRUE)

if you already have a recent version of GTK+ and RGtk2 installed, or by:

  source("http://yvonnick.noel.free.fr/r2stats/installwin.R")

for an automatic script that downloads and installs everything.

I will submit it to CRAN as soon as I have fixed some minor issues with R-devel (but the package works flawlessly with the current R-2.13.2). Any comment welcome. Also, if you are willing to contribute a translation into your language, please let me know. Best, Yvonnick Noel, PhD. University of Brittany Rennes, France
Re: [R] Hysteresis modeling and simulation
Hi Bill, I once modelled a hysteresis phenomenon (on binary data) with a simple logistic model. I am not sure I understand how this pattern appears in your data, but in my previous analyses, it appeared as an order effect: The response increased in probability later with increasing than with decreasing values of the predictor. I then simply created a binary variable for the decreasing and increasing conditions, and the coefficient on this variable was a direct and testable measure of hysteresis. In some cases, you can directly model the bimodal conditional distribution of the response. This is what I did here with a beta distribution for continuous bounded responses: http://webcolleges.uva.nl/mediasite/Viewer/?peid=c7a7b041327f4db09dc2fc3a7872aa5a1d HTH, Best, Yvonnick Noel University of Brittany, Rennes 2 France
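The logistic approach described above can be sketched on simulated data. Everything in this example (variable names, sweep design, effect sizes) is an assumption made up for illustration and is not the original analysis.

```r
set.seed(1)

# Simulated design: the predictor is swept upwards, then downwards
x         <- c(seq(-3, 3, length.out = 50), seq(3, -3, length.out = 50))
direction <- factor(rep(c("increasing", "decreasing"), each = 50))

# Assumed true effect: during the increasing sweep the response switches
# later, i.e. the logit is shifted down by 1.5
shift <- ifelse(direction == "increasing", -1.5, 0)
resp  <- rbinom(length(x), size = 1, prob = plogis(x + shift))

# Logistic model with the sweep direction as a binary covariate
fit <- glm(resp ~ x + direction, family = binomial)

# The coefficient on 'directionincreasing' estimates the hysteresis
# (on the logit scale); its Wald test is the test of hysteresis
print(summary(fit)$coefficients)
```

Dividing the direction coefficient by the slope on `x` expresses the lag between the two sweeps in units of the predictor.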
Re: [R] Object oriented programming in R.
I think you should have a look at the 'proto' package on CRAN. Yvonnick Noel University of Brittany, Rennes France
[R] Lattice: Superimposing histograms with different colors and transparency effects
Dear users, I would like to plot several histograms superimposed on the same panel with different colors, with superimposed polygons appearing with transparency effects. I also want estimated densities to appear on the same plot. For several reasons, including that I like it, I want to use the lattice package. I have several questions regarding the use of the 'histogram' function with a group structure.

I first thought that defining well-chosen values of alpha in trellis parameters would give the transparency effects, but this is not the case:

  library(lattice)

  # Some data
  x = c(rnorm(100), rnorm(100,2), rnorm(100,4))
  grouping = gl(3,100,300)

  # Trellis parameters
  trellis.par.set(superpose.polygon=list(alpha=rep(.5,3)))

  histogram(~x, groups=grouping, type = "density",
            panel = panel.superpose,
            panel.groups = function(x,...) {
              panel.histogram(x,...)
              panel.mathdensity(dmath=dnorm, args = list(mean=mean(x),sd=sd(x)),...)
            })

Besides the missing transparency, I get no fill colors at all, even though the plot.polygon and superpose.polygon parameters are set. I clearly need to define my own colors with the alpha channel set:

  mycolors = rgb(c(228, 55, 77), c(26, 126, 175), c(28, 184, 74), alpha = 50, maxColorValue = 255)

... and include 'mycolors' as an explicit argument in the histogram function:

  histogram(~x, groups=grouping, type = "density", ylim=c(0,.45),
            panel = panel.superpose, col=mycolors,
            auto.key=list(space="right",rectangles=FALSE,col=mycolors),
            panel.groups = function(x,...) {
              panel.histogram(x,...)
              panel.mathdensity(dmath=dnorm, args=list(mean=mean(x),sd=sd(x)),...)
            })

- First question: Is this the only way to get histogram bars filled, or am I doing something wrong in the use of trellis.par.set?

The problem with the previous approach is that the 'col' argument also affects the colors of the density curves, for which I don't want transparency effects. The 'col.lines' argument doesn't seem to change anything. Removing the (...) arguments is not an interesting option, as it suppresses some useful parameters for histograms (breaks, etc.).

- Second question: How do I get superimposed density curves with colors that differ from the bar colors (i.e. here: No transparency effects)?

- Third question: How do I find nice (and common) ylim values for the three histograms? I have set ylim=c(0,.45) above by hand, but I would like to see this calibrated beforehand. Adding a prepanel function is probably the way to go, but I am not sure how to manage this.

- Fourth question: I would like the bar borders to have colors that also vary from group to group, but unlike the 'col=' argument, adding a 'border=mycolors' argument in the histogram function call changes colors from bar to bar!

Thank you very much in advance. Best wishes, Yvonnick Noel, PhD. University of Brittany, Rennes France
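As a partial aid for the second question, one possible starting point is to keep two parallel color vectors built from the same base colors: a semi-transparent one for the bars and an opaque one for the curves. The sketch below only builds the vectors with base R (grDevices); the names `fill_cols` and `line_cols` are hypothetical, and how to hand each vector to the right lattice panel function is left open, since that is exactly what the message asks.

```r
# Base colours taken from the message above, here as fully opaque colours
base_cols <- rgb(c(228, 55, 77), c(26, 126, 175), c(28, 184, 74),
                 maxColorValue = 255)

# Semi-transparent versions for the bar fills (alpha.f scales the opacity)
fill_cols <- adjustcolor(base_cols, alpha.f = 0.2)

# Opaque versions for the density curves
line_cols <- base_cols

print(fill_cols)
print(line_cols)
```

Keeping the two vectors in the same group order means the i-th fill and the i-th curve always share a hue.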
Re: [R] Wrong symbol rendering in plots (Ubuntu)
Hello Ben,

> Does the workaround pointed out later in the thread you're responding to (from the last paragraph of a very long 'Note' section of ?pdf) help?

Well, I did not try to edit my fonts.conf but I feel this is not a PDF issue. I have no problem getting Greek letters correctly rendered in Ubuntu 10.04 (with LaTeX, or with OpenOffice Math and a PDF export, for instance). This only appears in R plots. Note that other symbols do not render well either. I put a PDF output of demo(plotmath) here: http://yvonnick.noel.free.fr/wrongsymbolsinubuntu.pdf Thank you. Yvonnick

  > sessionInfo()
  R version 2.11.0 (2010-04-22)
  i486-pc-linux-gnu

  locale:
   [1] LC_CTYPE=fr_FR.utf8       LC_NUMERIC=C
   [3] LC_TIME=fr_FR.utf8        LC_COLLATE=fr_FR.utf8
   [5] LC_MONETARY=C             LC_MESSAGES=fr_FR.utf8
   [7] LC_PAPER=fr_FR.utf8       LC_NAME=C
   [9] LC_ADDRESS=C              LC_TELEPHONE=C
  [11] LC_MEASUREMENT=fr_FR.utf8 LC_IDENTIFICATION=C

  attached base packages:
  [1] stats graphics grDevices utils datasets methods base
Re: [R] Wrong symbol rendering in plots (Ubuntu)
Hello, I have the very same problem. Plotting code that used to work before I upgraded to Ubuntu Lucid Lynx does not work anymore. For example:

  plot(1:10)
  text(6,4,expression(pi))

The Greek letter 'pi' appears as a \neq (the "different from" symbol). Yvonnick Noel
Re: [R] How to run Shapiro-Wilk test for each grouped
Iurie,

> Noel, thanks a lot. This will help me someday. But I have a question. When we run Shapiro-Wilk test, the homogenity of variances is a mandatory condition?

No it is not. A homoscedasticity test only makes sense when you have a grouping factor, and a normality test may of course be used in a variety of contexts when you have a single sample. My point was: If you use Gaussian models and assume homogeneity of within-group variances, then testing normality is somewhat simplified, since your model residuals are expected to be drawn from a single normal distribution, and only one normality test on the residuals is necessary (no need for a loop). Best wishes, Yvonnick NOEL, PhD. University of Brittany, Rennes 2 France
Re: [R] How to run Shapiro-Wilk test for each grouped variable?
Dear Iurie,

> I want to run Shapiro-Wilk test for each variable in my dataset, each grouped by variable groupFactor.

Note that, at least on a single dependent variable with a grouping variable, a possible simplification may arise when homogeneity of variances is assumed and reasonable. You may want to do a single normality test on group-centered data:

  shapiro.test(residuals(lm(data[,1]~groupFactor)))

HTH, Yvonnick Noel University of Brittany, Rennes 2 France
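Both options can be put side by side in a short sketch. The built-in iris data stand in for the poster's dataset here, so the variable names are assumptions for illustration.

```r
# Built-in data used purely for illustration
dat <- iris

# Option 1: one Shapiro-Wilk test per group (no homogeneity assumption)
p_by_group <- tapply(dat$Sepal.Length, dat$Species,
                     function(v) shapiro.test(v)$p.value)
print(p_by_group)

# Option 2: a single test on group-centred data, i.e. the residuals of a
# one-way model, which assumes a common within-group variance
fit      <- lm(Sepal.Length ~ Species, data = dat)
res_test <- shapiro.test(residuals(fit))
print(res_test)
```

Option 2 pools all observations into one test, so it has more power than the per-group tests, at the cost of relying on the equal-variance assumption.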
Re: [R] Comparing two groups of proportions
Hi Ivan, It was not clear from your original post that QA was a repeated factor. But your problem may be reframed very much like you would do with a McNemar chi-square: Just count the number of times both procedures give the same result, of each kind, and different results, again of both kinds, to get four counts by condition. These events are probably independent within your setting. You should then be able to test various binomial or Poisson models with the proper equality constraints. HTH, Yvonnick Noel, PhD University of Brittany at Rennes France

> Re: [R] Comparing two groups of proportions
>
> Hi Rolf,
>
> On Monday 09 June 2008 11:16:57 pm Rolf Turner wrote:
>> Your approach tacitly assumes --- as did the poster's question --- that the probability of passing an item by one method is *independent* of whether it is passed by the other method. Which makes the methods effectively independent of the nature of the item being assessed!
>
> So it seems I can't just block my primary factor (QA procedure) by nuisance one (production line) and run Cochran test to see if effects of primary factor are identical for both its levels.
>
>> Not much actual quality being assured there!
>
> In fact, I am not interested in quality of QA procedures as much as in how different the results are (error component).
>
> Thanks, Ivan
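The four-count layout suggested above can be sketched for a single condition as follows. The counts are invented for illustration, and the plain McNemar test shown here is only the simplest special case of the constrained binomial or Poisson models mentioned in the reply.

```r
# Hypothetical counts for one production line:
# rows = result of procedure A, columns = result of procedure B
results <- matrix(c(45,  5,
                    12, 38),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(A = c("pass", "fail"),
                                  B = c("pass", "fail")))

# McNemar's test uses only the discordant cells (5 and 12 here)
mc <- mcnemar.test(results)
print(mc)

# Exact version: under the null hypothesis, the discordant pairs
# split 50/50 between the two kinds of disagreement
n_discordant <- results["pass", "fail"] + results["fail", "pass"]
bt <- binom.test(results["pass", "fail"], n_discordant, p = 0.5)
print(bt)
```

Repeating this per production line gives one table of four counts per condition, which is the input the model-based comparison would work from.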
[R] Ancova_non-normality of errors
Hello Tobias, I am not sure what your wt variable is: I suspect a 'weight'. If it is a nonnegative measure, then you want a positive density model, not a normal density in the first place. I think you should try a Gamma GLM, and look at a Gamma qqplot within each of your conditions. You could try the following:

  M1 = glm(wt ~ pes + origin + gender + gender:pes, family=Gamma(link="identity"))
  M2 = glm(wt ~ pes + origin + gender + gender:pes, family=Gamma(link="log"))
  M3 = glm(wt ~ pes + origin + gender + gender:pes, family=Gamma(link="inverse"))

and see whether one of them fits better, in terms of qqplot adjustment or comparative fit indices (AIC, BIC, ...). HTH, Yvonnick Noel, PhD University of Brittany France

> Date: Sun, 04 May 2008 11:56:09 +0200
> From: Tobias Erik Reiners
> Subject: [R] Ancova_non-normality of errors
>
> Hello Helpers, I have some problems with fitting the model for my data... My literature (Crawley textbook) says: non-normality of errors. I get a banana-shaped Q-Q plot with the opening of the banana downwards. The goal of my analysis is to work out what effect the categorical factors (origin, gender) have on the relation log(wt)~log(pes) (condition, fat resource). Does the source (origin) of translocated animals have an effect on performance (condition) in the new area? I already have a best-fit model and it looks quite good (or not? see below): two slopes (gender difference) and 6 intercepts (3 origin levels * 2 gender levels).
>
>   lm(formula = log(wt) ~ log(pes) + origin + gender + gender:log(pes))
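The link comparison suggested in the reply can be tried out on simulated data before applying it to the real ones. In this sketch the data, the single predictor, and the true mean structure are all assumptions made up for illustration; only the variable names `wt` and `pes` are borrowed from the thread.

```r
set.seed(42)

# Simulated positive response (all values here are illustrative)
n   <- 200
pes <- runif(n, 0, 5)
mu  <- 10 + 2 * pes                       # assumed true mean, linear in pes
wt  <- rgamma(n, shape = 20, rate = 20 / mu)
dat <- data.frame(wt = wt, pes = pes)

# Same linear predictor, three link functions
fits <- list(
  identity = glm(wt ~ pes, family = Gamma(link = "identity"), data = dat),
  log      = glm(wt ~ pes, family = Gamma(link = "log"),      data = dat),
  inverse  = glm(wt ~ pes, family = Gamma(link = "inverse"),  data = dat)
)

# Compare the three fits by AIC (lower is better)
aic <- sapply(fits, AIC)
print(aic)
```

With real data the differences in AIC between links can be small, so it is worth looking at the residual plots of the candidate models as well rather than relying on AIC alone.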
[R] Multiply a 3D-array by a vector (weighted combination of matrices)
Hello, I would like to compute a weighted combination of matrices. I have a number of matrices, arranged in a 3D-array, say:

  z = array(rep(1:3,c(9,9,9)),c(3,3,3))

so that z[,,1] is my first matrix, and z[,,2] and z[,,3] the second and third ones, and a vector of coefficients:

  w = rep(1/3,3)

I would like to compute:

  w[1]*z[,,1] + w[2]*z[,,2] + w[3]*z[,,3]

I could of course do this using a for() loop, but would like to know if there is a way to do it in a vectorized manner, or any other way that is likely to result in faster computation. Any hint? Thank you very much in advance, YNOEL
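One possible vectorized route, sketched here with base R only and not taken from any reply in the thread, relies on the fact that R stores arrays in column-major order: each slice z[,,k] can be flattened into one column of a matrix, so the weighted sum becomes a single matrix product.

```r
z <- array(rep(1:3, c(9, 9, 9)), c(3, 3, 3))
w <- rep(1/3, 3)

# Reference result with an explicit loop
loop_sum <- matrix(0, nrow = dim(z)[1], ncol = dim(z)[2])
for (k in seq_along(w)) loop_sum <- loop_sum + w[k] * z[, , k]

# Vectorized version: flatten each slice into a column, multiply by w,
# then restore the matrix shape
flat    <- matrix(z, ncol = dim(z)[3])
vec_sum <- matrix(flat %*% w, nrow = dim(z)[1], ncol = dim(z)[2])

# Alternative using apply() over the first two dimensions
# (usually slower than the matrix product, but easy to read)
apply_sum <- apply(z, c(1, 2), function(v) sum(v * w))

print(vec_sum)
```

For large arrays the matrix product delegates the work to the underlying linear algebra routines, which is typically where any speed gain over the loop would come from.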