Re: [R] Help optimizing EMD::extrema()
Wow, great work! This makes a huge difference (i.e. 5 mins to process an electrode instead of 5 hours). I believe that ndata and ndatam1 are there to let EMD::emd() and EMD::extractimf() (the functions that call EMD::extrema()) implement various boundary corrections. I personally don't intend to examine the start or end in great detail, so I think the uncorrected version is fine for my application. Thanks again!

Mike

On Sun, Feb 13, 2011 at 6:57 PM, William Dunlap wdun...@tibco.com wrote:

I did not try to emulate the ndata and ndatam1 arguments to extrema(), as I didn't see what they were for. The help file simply says they are the length of the first argument and that minus 1, and those are their default values. If they do not have their default values then extrema() frequently dies. You could add them to the argument list and not use them, or check that they are what they default to, as in

    function(x, ndata, ndatam1) {
        stopifnot(length(x) == ndata, ndata - 1 == ndatam1)
        ... rest of code ...
    }

If the check fails then someone needs to say or figure out what they are for.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
Sent: Sunday, February 13, 2011 2:08 PM
To: Mike Lawrence; r-h...@lists.r-project.org
Subject: Re: [R] Help optimizing EMD::extrema()

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mike Lawrence
Sent: Friday, February 11, 2011 9:28 AM
To: r-h...@lists.r-project.org
Subject: [R] Help optimizing EMD::extrema()

Hi folks, I'm attempting to use the EMD package to analyze some neuroimaging data (timeseries with 64 channels sampled across 1 million time points within each of 20 people). I found that processing a single channel of data using EMD::emd() took about 8 hours. Exploration using Rprof() suggested that most of the compute time was spent in EMD::extrema().
Looking at the code for EMD::extrema(), I managed to find one obvious speedup (switching from employing rbind() to c()), and I suspect that there may be a way to further speed things up by pre-allocating all the objects that are currently being created with c(), but I'm having trouble understanding the code sufficiently to know when/where to try this and what sizes to set as the default pre-allocation length. Below I include code that demonstrates the speedup I achieved by eliminating calls to rbind(), and also demonstrates that only a few calls to c() seem to be responsible for most of the compute time. The files extrema_c.R and extrema_c2.R are available at: https://gist.github.com/822691

Try the following new.extrema(). It is based on looking at runs in the data in a vectorized way. On my old laptop the running times for length(x) = 2^(2:18) with EMD::extrema and new.extrema are

           old.time new.time
        4      0.00     0.00
        8      0.00     0.00
       16      0.00     0.00
       32      0.00     0.00
       64      0.00     0.00
      128      0.00     0.00
      256      0.00     0.00
      512      0.02     0.00
     1024      0.03     0.00
     2048      0.06     0.01
     4096      0.14     0.00
     8192      0.37     0.02
    16384      1.08     0.03
    32768      3.64     0.06
    65536     13.35     0.12
   131072     48.42     0.25
   262144    206.17     0.59

(Comparison and logical operators below were stripped in transit; they are reconstructed here.)

    isEndOfStrictlyIncreasingRun <- function(x) {
        c(FALSE, diff(diff(x) > 0) < 0, FALSE)
    }
    isStartOfStrictlyIncreasingRun <- function(x) {
        c(FALSE, diff(diff(x) <= 0) < 0, FALSE)
    }
    isEndOfStrictlyDecreasingRun <- function(x) {
        isEndOfStrictlyIncreasingRun(-x)
    }
    isStartOfStrictlyDecreasingRun <- function(x) {
        isStartOfStrictlyIncreasingRun(-x)
    }
    isStartOfZeroRun <- function(x) {
        x == 0 & c(TRUE, x[-length(x)] != 0)
    }
    nToEndOfCurrentFlatRun <- function(x) {
        # for each element of x, how far to end of current
        # run of equal values.
        rev(sequence(rle(rev(x))$lengths) - 1L)
    }
    indexOfEndOfCurrentFlatRun <- function(x) {
        # should be a more direct way of doing this, but this is pretty quick
        seq_len(length(x)) + nToEndOfCurrentFlatRun(x)
    }
    new.extrema <- function(x) {
        f <- indexOfEndOfCurrentFlatRun(x)
        isMaxStart <- isEndOfStrictlyIncreasingRun(x) & isStartOfStrictlyDecreasingRun(x)[f]
        maxindex <- cbind(which(isMaxStart), f[isMaxStart], deparse.level = 0)
        isMinStart <- isEndOfStrictlyDecreasingRun(x) & isStartOfStrictlyIncreasingRun(x)[f]
        minindex <- cbind(which(isMinStart), f[isMinStart], deparse.level = 0)
        # zero-crossings are a bit weird: Report either the before-zero/after-zero
        # pair or the start and stop of a run of zero's (even if the run is
        # not part of an actual zero-crossing). Do them separately then sort.
        # Also, if the entire sequence never actually crosses 0, do
        # not report the zero
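To make the run-based trick concrete, here is a small standalone sketch of my own (not from Bill's post) showing what the two core helpers compute on a tiny vector, with their bodies restated so it runs on its own:

```r
# Standalone illustration of the run-based vectorization: rle() measures
# runs of equal values, and diff() of a logical vector marks run boundaries.
x <- c(1, 2, 2, 1, 0, 0, 1)

# distance from each element to the end of its run of equal values
nToEnd <- rev(sequence(rle(rev(x))$lengths) - 1L)

# ends of strictly increasing runs: an up-step followed by a non-up-step
endInc <- c(FALSE, diff(diff(x) > 0) < 0, FALSE)

nToEnd         # 0 1 0 0 1 0 0
which(endInc)  # 2 (the increase 1 -> 2 ends at position 2)
```

Because every step is a whole-vector operation, cost grows linearly with length(x), which is why the timings above stay flat where the c()/rbind() version explodes.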
[R] Help optimizing EMD::extrema()
Hi folks, I'm attempting to use the EMD package to analyze some neuroimaging data (timeseries with 64 channels sampled across 1 million time points within each of 20 people). I found that processing a single channel of data using EMD::emd() took about 8 hours. Exploration using Rprof() suggested that most of the compute time was spent in EMD::extrema(). Looking at the code for EMD::extrema(), I managed to find one obvious speedup (switching from employing rbind() to c()), and I suspect that there may be a way to further speed things up by pre-allocating all the objects that are currently being created with c(), but I'm having trouble understanding the code sufficiently to know when/where to try this and what sizes to set as the default pre-allocation length. Below I include code that demonstrates the speedup I achieved by eliminating calls to rbind(), and also demonstrates that only a few calls to c() seem to be responsible for most of the compute time. The files extrema_c.R and extrema_c2.R are available at: https://gist.github.com/822691 Any suggestions/help would be greatly appreciated.

    #load the EMD library for the default version of extrema
    library(EMD)

    #some data to process
    values = rnorm(1e4)

    #profile the default version of extrema
    Rprof(tmp <- tempfile())
    temp = extrema(values)
    Rprof()
    summaryRprof(tmp) #1.2s total, with most time spent doing rbind
    unlink(tmp)

    #load a rbind-free version of extrema
    source('extrema_c.R')
    Rprof(tmp <- tempfile())
    temp = extrema_c(values)
    Rprof()
    summaryRprof(tmp) #much faster! .5s total
    unlink(tmp)

    #still, it encounters slowdowns with lots of data
    values = rnorm(1e5)
    Rprof(tmp <- tempfile())
    temp = extrema_c(values)
    Rprof()
    summaryRprof(tmp) #44s total, hard to see what's taking up so much time
    unlink(tmp)

    #load an rbind-free version of extrema that labels each call to c()
    source('extrema_c2.R')
    Rprof(tmp <- tempfile())
    temp = extrema_c2(values)
    Rprof()
    summaryRprof(tmp) #same time as above, but now we see that it spends more time in certain calls to c() than others
    unlink(tmp)

__________________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
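The slowdown Mike is profiling is the classic grow-by-c() pattern: each c(out, i) copies the whole vector, giving quadratic cost. A minimal sketch (my own, not from the thread) contrasting growth with pre-allocation:

```r
# Growing a vector with c() inside a loop reallocates and copies on every
# iteration (O(n^2) work overall); pre-allocating does the work in O(n).
grow <- function(n) {
  out <- numeric(0)
  for (i in 1:n) out <- c(out, i)  # full copy each iteration
  out
}
prealloc <- function(n) {
  out <- numeric(n)                # allocate once
  for (i in 1:n) out[i] <- i      # write in place
  out
}

n <- 1e4
system.time(grow(n))      # noticeably slower as n grows
system.time(prealloc(n))  # roughly linear in n
```

The results are identical; only the allocation strategy differs, which is why replacing the hot c() calls in extrema() with pre-allocated writes (or a fully vectorized formulation, as in new.extrema()) pays off so dramatically.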
[R] [R-pkgs] ez version 3.0
Hi folks, I'd like to announce the release of version 3.0 of the ez package. This package was developed to aid those who are new to statistical programming. Over the course of several years of helping colleagues and students learn R, I observed that folks are often initially turned off R because they have difficulty obtaining SPSS-like results quickly (SPSS is the dominant environment in my field, psychology). ez attempts to fill this gap, providing quick and easy analysis and graphics for common experimental designs. By easing the early portions of the R learning curve, ez hopes to promote the spread of R as a means of open source and reproducible analysis.

ez may also be of interest to more advanced users, as it includes the ezMixed() function, which automates the assessment of fixed effects in a mixed effects modelling context, and the ezPredict() function, which obtains predictions for the fixed effects from a mixed effects model.

Installing ez

Version 3.0 of ez requires that you have R 2.12.1 installed, which is the latest version of R as of this notice. If you have an older version of R, you will need to update R by installing the latest version (from http://cran.r-project.org/) before installing ez.
Windows and Linux users should be able to install the latest version by running the command:

    install.packages('ez')

Mac users should be able to install the latest version by running the commands:

    install.packages(c('car', 'reshape2', 'plyr', 'ggplot2', 'stringr', 'lme4', 'Matrix'))
    install.packages('ez', type='source', dependencies=F)

Once installed, running the following commands will load ez and bring up its help page, which links to descriptions of all of ez's functions:

    library(ez)
    ?ez

Big changes in version 3.0
- A big rework of ezANOVA() to permit more flexibility, including more nuanced handling of numeric predictor variables, specification of sums-of-squares types when data is imbalanced, and an option to compute/return an aov object representing the requested ANOVA for follow-up contrast analysis. (The latter two features follow from the discussion at http://stats.stackexchange.com/questions/6208/should-i-include-an-argument-to-request-type-iii-sums-of-squares-in-ezanova)
- An important bugfix for ezMixed(), which previously permitted specification of multiple random effects but silently ignored all but the last!
- A big rework of ezMixed(), completely changing the output (including removal of p-values, following the advice of Pinheiro & Bates (2000) and many on the R-SIG-Mixed-Models mailing list) and providing a new feature whereby the linearity of numeric fixed effects can be assessed by comparison to higher polynomial degrees.

Also new

As noted above, this version fixes a big bug in ezMixed() about which I wish I could have warned users sooner. To facilitate future rapid notification of users, I've created a discussion group (http://groups.google.com/group/ez4r) to which users can/should subscribe to keep up to date on development news. Additionally, I created a project page on github (https://github.com/mike-lawrence/ez/issues) where users can submit bug reports and feature requests.
Finally, I encourage users to rate or review ez (and any other packages you use) at crantastic.org (http://crantastic.org/packages/ez).

Official CHANGES entry

3.0-0
- added urls in all documentation pointing to the bug-report/feature-request site (https://github.com/mike-lawrence/ez/issues) and the discussion group (http://groups.google.com/group/ez4r).
- changed reshape dependency to reshape2
- ezANOVA
  - fixed bug such that if detailed=F and car::Anova fails, the first line of results from stats::aov is cut off.
  - Added more nuanced treatment of numeric variables
  - Added type argument to specify sums-of-squares type selection (1, 2, or 3)
  - Added white.adjust argument to permit heteroscedasticity adjustment (see ?car::Anova)
  - Added return_aov argument to permit returning a stats::aov object along with results (this was requested by a user seeking to use the aov object to compute contrasts)
- ezMixed
  - IMPORTANT: Fixed bug such that only the last specified random effect was implemented
  - fixed bug such that an error occurred if only one fixed effect was specified
  - changed output format to a list containing a summary data frame, a list of formulae, a list of errors, a list of warnings, and (optionally) a list of fitted models
  - Changed format of summary output, including removal of p-values (on the strong advice of many that the p-values from a likelihood ratio test of a fixed effect are highly questionable)
  - removed the return_anovas argument
  - added nuanced ability to explore non-linear effects via addition of fixed_poly and fixed_poly_max arguments
- ezPredict
  - Added ability to handle models fit with I()'d variables
  - Added stop error when encountering models with poly() and no supplied prediction data
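For readers curious what the new sums-of-squares `type` option involves under the hood, here is a sketch of my own (not from the announcement, and using base R's drop1() rather than ez's internals): type-III-style marginal tests are only meaningful with sum-to-zero contrasts, which must be set before fitting.

```r
# Type-III-style marginal F tests by hand: sum-to-zero contrasts + drop1().
# ezANOVA's `type` argument packages the equivalent via car::Anova.
set.seed(1)
d <- data.frame(
  y = rnorm(40),
  A = factor(rep(1:2, each = 20)),
  B = factor(rep(1:2, times = 20))
)

op <- options(contrasts = c("contr.sum", "contr.poly"))  # required for type III
mod <- lm(y ~ A * B, data = d)
tab <- drop1(mod, . ~ ., test = "F")  # marginal F test for each term
options(op)                            # restore previous contrasts
tab
```

With balanced data all sums-of-squares types agree; the distinction only matters for imbalanced designs, which is exactly the case the announcement mentions.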
[R] hierarchical mixture modelling
Hi folks, I have circular data that I'd like to model as a mixture of a uniform and a von Mises centered on zero. The data are from 2 groups of human participants, where each participant is measured repeatedly within each of several conditions. So, the data frame would look something like:

    design = expand.grid(
        person = 1:20 ,
        variable1 = 1:2 ,
        variable2 = 1:3 ,
        repetition = 1:100
    )
    design$group = design$person %% 2

where each row would have a data point. Now, I know how to fit the mixture of a uniform and von Mises to each individual cell of the data independently using the EM algorithm, yielding estimates of the mixture proportion and von Mises concentration per cell. However, I of course want to assess the degree to which the group and other variables in the design affect these model parameters, and at least in the case of the proportion estimate, I'm uncomfortable submitting the raw proportion to a test that is going to assume Gaussian error (eg. ANOVA, or lmer(..., family=gaussian)). I'm aware that lmer lets one specify non-gaussian links, but as I understand it, if I wanted to, say, specify the binomial link (which seems appropriate for a proportion), lmer wants the data to be the raw 1's and 0's, not the proportion estimate obtained from EM. I've heard that there are hierarchical mixture modelling methods out there (possibly Bayesian hierarchical mixture modelling) that might let me achieve model fitting and inference in one step (eg. model the mixture and the influence on each parameter from the between- and within-person variables, treating people as random effects), but I'm having trouble tracking down instructions on how to do this. Any pointers would be greatly appreciated!

Cheers, Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think.
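For concreteness, the per-cell EM fit described above can be sketched as follows. This is my own code, not from the thread: a two-component mixture of a circular uniform and a von Mises with mean fixed at zero, where the concentration update uses the standard Best-Fisher approximation to the inverse of A1(kappa). The wrapped-normal draws are only a stand-in for proper von Mises simulation.

```r
# EM for a mixture of circular uniform and von Mises(mu = 0, kappa),
# estimating the mixture proportion p and concentration kappa.
dvm0 <- function(theta, kappa) {          # von Mises density with mu = 0
  exp(kappa * cos(theta)) / (2 * pi * besselI(kappa, 0))
}
a1inv <- function(R) {                    # Best-Fisher approximation to A1^{-1}
  if (R < 0.53) 2 * R + R^3 + 5 * R^5 / 6
  else if (R < 0.85) -0.4 + 1.39 * R + 0.43 / (1 - R)
  else 1 / (R^3 - 4 * R^2 + 3 * R)
}
em_unif_vm <- function(theta, p = 0.5, kappa = 1, iter = 100) {
  for (i in seq_len(iter)) {
    num <- p * dvm0(theta, kappa)         # E-step: responsibilities
    w <- num / (num + (1 - p) / (2 * pi))
    p <- mean(w)                          # M-step: mixture proportion
    R <- max(sum(w * cos(theta)) / sum(w), 1e-8)  # weighted resultant (mu = 0)
    kappa <- a1inv(min(R, 0.99))          # concentration update
  }
  list(p = p, kappa = kappa)
}

# simulated check: ~70% concentrated near 0, ~30% circular uniform
set.seed(1)
n <- 2000
z <- runif(n) < 0.7
wrapped <- rnorm(n, 0, 0.55)              # wrapped-normal stand-in for von Mises
theta <- ifelse(z, atan2(sin(wrapped), cos(wrapped)), runif(n, -pi, pi))
fit <- em_unif_vm(theta)
fit$p      # near 0.7
fit$kappa  # a few units (roughly 1/0.55^2 scale)
```

Fitting each cell this way gives the per-cell (p, kappa) estimates; the hierarchical question in the post is about replacing this two-stage procedure with a single model in which p and kappa themselves get participant-level random effects.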
~
[R] [R-pkgs] ez version 2.0
The ez package was developed to aid those who are new to statistical programming. Over the course of several years of helping colleagues and students learn R, I observed that folks are often initially turned off R because they have difficulty obtaining SPSS-like results quickly (SPSS is the dominant environment in my field, psychology). ez attempts to fill this gap, providing quick and easy analysis and graphics for common experimental designs. By easing the early portions of the R learning curve, ez hopes to promote the spread of R as a means of open source and reproducible analysis. ez now also attempts to pique interest in cutting-edge statistical methods by providing easy specification and visualization of simple* mixed effects models.

*-- mixed effects models are limited to those with a single random effect (eg. Participant) and no numeric predictors.

New in version 2.0
- ezDesign(), a function to visualize the balance of data across a specified experimental design (useful for diagnosing missing data)
- ezPrecis(), a function to summarize a data set (inspired by summary(), str(), and YaleToolkit::whatis())
- ezBoot() and ezPlotBoot(), functions to compute and visualize (respectively) bootstrapped confidence intervals for either cell means or predictions from a mixed effects model
- ezANOVA() updated with an improved measure of effect size: generalized eta-squared
- ezPlot() updated to permit simultaneous plotting of multiple DVs, with each mapped to a row of the plot facets
- see the changelog for further changes

Enjoy!
Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think.
~

___________________________________________
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
[R] compare gam fits
Hi folks, I originally tried R-SIG-Mixed-Models for this one (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q3/004170.html), but I think that the final steps to a solution aren't mixed-model specific, so I thought I'd ask my final questions here.

I used gamm4 to fit a generalized additive mixed model to data from an AxBxC design, where A is a random effect (human participants in an experiment), B is a 2-level factor predictor variable, and C is a continuous variable that is likely non-linear. I tell gamm4 to fit a smooth across C to each level of B independently, and I can use predict.gam(..., se.fit=T) to obtain predictions from the fitted model as well as the standard error for the predictions.

I'd like to visualize the BxC interaction to see if smoothing C within each level of B was really necessary, and if so, where it is (along the C dimension) that B affects the smooth. It's easy enough to obtain the predicted B1-B2 difference function, but I'm stuck on how to convey the uncertainty of this function (e.g. computing the confidence interval of the difference at each value of C). One thought is that predict.gam(..., se.fit=T) returns SE values, so if I could find out the N on which these SE values are computed, I could compute the difference CI as

    sqrt(((SD_B1)^2 + (SD_B2)^2) / N) * qt(.975, df = N - 1)

However, I can't seem to figure out what value of N was used to compute the SEs that predict.gam(..., se.fit=T) produces. Can anyone point me to where I might find N? Further, is N-1 the proper df for the call to qt()? Finally, with a smooth function and 95% confidence intervals computed at each of a large number of points, don't I run into a problem of an inflated Type I error rate? Or does the fact that each point is not independent of those next to it make this an inappropriate concern?

Cheers, Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting?
Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
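One route around the "what is N?" question entirely: predict.gam's SEs are model-based (square roots of the diagonal of X V X', with V the coefficient covariance matrix), so the difference and its CI can be computed directly from the linear-predictor matrix. A sketch of mine (assuming mgcv's gam on simulated data; for a gamm4 fit the same applies to its $gam component):

```r
# Pointwise CI for the difference between two by-factor smooths, via
# predict(..., type = "lpmatrix") and the coefficient covariance matrix.
library(mgcv)
set.seed(1)
d <- data.frame(C = runif(400), B = factor(rep(1:2, each = 200)))
d$y <- sin(2 * pi * d$C) * (d$B == "1") + rnorm(400, 0, 0.3)
m <- gam(y ~ B + s(C, by = B), data = d)

Cgrid <- seq(0, 1, length.out = 100)
nd1 <- data.frame(C = Cgrid, B = factor(1, levels = 1:2))
nd2 <- data.frame(C = Cgrid, B = factor(2, levels = 1:2))

# difference of the two linear-predictor matrices
Xd <- predict(m, nd1, type = "lpmatrix") - predict(m, nd2, type = "lpmatrix")

est <- drop(Xd %*% coef(m))                   # predicted B1 - B2 at each C
se  <- sqrt(rowSums((Xd %*% vcov(m)) * Xd))   # SE of the difference
upper <- est + 1.96 * se
lower <- est - 1.96 * se
```

Note these are pointwise intervals, which speaks to the Type I worry at the end of the post: for claims about the whole curve you would want a simultaneous interval (e.g. by simulating coefficient vectors from their approximate posterior and taking the max absolute standardized deviation), which widens the band to account for looking everywhere along C at once.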
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Check out the reshape package for transforming data from long to wide and vice versa. Yet I still don't know what problem you've encountered with ezANOVA. Using the data you just sent, where Day now has 3 levels, I reformat back to the presumably original long format and find that ezANOVA returns the same sphericity tests as John's solution (which is expected, because ezANOVA is a wrapper to Anova):

    library(ez)
    a = read.table('Sergios_wide_data.txt', header=T)
    b = melt.data.frame(data=a, id.vars=c('subject','treatment'), variable_name='day')
    ezANOVA(data=b, dv=.(value), sid=.(subject), within=.(treatment,day))

On Mon, Nov 9, 2009 at 10:20 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Thank you very much. Finally got it to work. However, I had to recode it from columns subject/treatment/DV (where all my response data was in one DV column) to columns subject/treatment/day1/day2/day3 (where my response data is now in three different columns). Is there a way to do that without hand recoding (cutting and pasting in a spreadsheet)? Thank you for your help. Glad it works as is.

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 7:12 PM, John Fox j...@mcmaster.ca wrote:

Dear Sergios, Why don't you try what I suggested originally? Adapted to this data set:

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(Day=factor(1:3))
    summary(Anova(mod, idata=idata, idesign=~Day))

Peter Dalgaard also pointed toward an article that describes how to do the same thing with anova().
Regards,
John

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sergios (Sergey) Charntikov
Sent: November-09-09 7:13 PM
To: Mike Lawrence
Cc: r-help@r-project.org
Subject: Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)

Hi Mike, I tried to run my data in SPSS and it works fine without any problems: plug in my levels, plug in my covariate (since it is all within) and get my Mauchly tests. I tried to rearrange the data so it looks like this (subj/treatment/day1/day2/day3):

    subject treatment day1 day2 day3
    1       1         8    8    8
    1       2         5    7    5
    2       1         7    4    4
    2       2         4    5    7
    3       1         8    6    4
    3       2         5    2    4
    4       1         2    9    4
    4       2         1    9    1
    5       1         4    8    1
    5       2         7    8    2
    6       1         4    7    2
    6       2         4    5    2

When I try

    mlmfit <- lm(Dataset ~ 1)

I get: invalid type (list) for variable 'Dataset'. When I try

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(factor(rep(c(Dataset$day1, Dataset$day2, Dataset$day3))), ordered(Dataset$Treatment))
    Anova(mod, idata=idata, idesign=~Dataset$Treatment)

I get: Terms in the intra-subject model matrix are not orthogonal. When I try is.matrix(Dataset), I get no. My original mock Dataset (attached in txt) is below. Maybe I am not coding it right? I would hate to recode all my data for SPSS, since at the end I would need to show that sphericity was not violated.

    Subj Trtmt Sessn Response
    1    N     1     5
    1    D     1     6
    1    N     2     4
    1    D     2     7
    2    N     1     8
    2    D     1     9
    2    N     2     2
    2    D     2     1
    3    N     1     4
    3    D     1     5
    3    N     2     6
    3    D     2     2
    4    N     1     5
    4    D     1     6
    4    N     2     4
    4    D     2     7
    5    N     1     8
    5    D     1     9
    5    N     2     2
    5    D     2     1
    6    N     1     4
    6    D     1     5
    6    N     2     6
    6    D     2     2

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 5:29 PM, Mike Lawrence mike.lawre...@dal.ca wrote:

No luck as in...? What error did you encounter?
In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily holds for repeated measures with only 2 levels, and tests are really only possible for repeated measures with 3 or more levels. I think it's analogous to how you don't need to test homogeneity of variance when performing a paired t-test; the test ends up representing the pairs as a single distribution of difference scores with a single variance.

Mike

On Mon, Nov 9, 2009 at 5:30 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Tried EZanova, no luck with my particular dataset
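On the side question of converting long data to wide without hand-recoding in a spreadsheet: besides melt()/cast() from the reshape package, base R's reshape() handles this too. A sketch of my own using a mock data frame shaped like the one in the thread (the column names are illustrative):

```r
# Long -> wide with base R's reshape(): one row per subject x treatment,
# one column per day.
long <- data.frame(
  subject   = rep(1:3, each = 3),
  treatment = 1,
  day       = rep(1:3, times = 3),
  value     = 1:9
)
wide <- reshape(long, idvar = c("subject", "treatment"),
                timevar = "day", direction = "wide")
# wide now has columns: subject, treatment, value.1, value.2, value.3

# reshape() stores its bookkeeping as attributes, so the reverse trip
# needs no extra arguments:
long2 <- reshape(wide, direction = "long")
```

melt.data.frame() as used above is the reshape-package equivalent of the reverse (wide to long) direction.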
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Oops, I see now that despite repeated subject names, treatment is a between-Ss variable, so you need to use this to get the equivalent of Anova:

    library(ez)
    a = read.table('Sergios_wide_data.txt', header=T)
    b = melt.data.frame(data=a, id.vars=c('subject','treatment'), variable_name='day')
    b$subject = paste(b$subject, b$treatment) #create unique sids
    ezANOVA(data=b, dv=.(value), sid=.(subject), within=.(day), between=.(treatment))

On Tue, Nov 10, 2009 at 6:47 AM, Mike Lawrence mike.lawre...@dal.ca wrote:

Check out the reshape package for transforming data from long to wide and vice versa. Yet I still don't know what problem you've encountered with ezANOVA. Using the data you just sent, where Day now has 3 levels, I reformat back to the presumably original long format and find that ezANOVA returns the same sphericity tests as John's solution (which is expected, because ezANOVA is a wrapper to Anova):

    library(ez)
    a = read.table('Sergios_wide_data.txt', header=T)
    b = melt.data.frame(data=a, id.vars=c('subject','treatment'), variable_name='day')
    ezANOVA(data=b, dv=.(value), sid=.(subject), within=.(treatment,day))

On Mon, Nov 9, 2009 at 10:20 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Thank you very much. Finally got it to work. However, I had to recode it from columns subject/treatment/DV (where all my response data was in one DV column) to columns subject/treatment/day1/day2/day3 (where my response data is now in three different columns). Is there a way to do that without hand recoding (cutting and pasting in a spreadsheet)? Thank you for your help. Glad it works as is.

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 7:12 PM, John Fox j...@mcmaster.ca wrote:

Dear Sergios, Why don't you try what I suggested originally?
Adapted to this data set:

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(Day=factor(1:3))
    summary(Anova(mod, idata=idata, idesign=~Day))

Peter Dalgaard also pointed toward an article that describes how to do the same thing with anova().

Regards,
John

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sergios (Sergey) Charntikov
Sent: November-09-09 7:13 PM
To: Mike Lawrence
Cc: r-help@r-project.org
Subject: Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)

Hi Mike, I tried to run my data in SPSS and it works fine without any problems: plug in my levels, plug in my covariate (since it is all within) and get my Mauchly tests. I tried to rearrange the data so it looks like this (subj/treatment/day1/day2/day3):

    subject treatment day1 day2 day3
    1       1         8    8    8
    1       2         5    7    5
    2       1         7    4    4
    2       2         4    5    7
    3       1         8    6    4
    3       2         5    2    4
    4       1         2    9    4
    4       2         1    9    1
    5       1         4    8    1
    5       2         7    8    2
    6       1         4    7    2
    6       2         4    5    2

When I try

    mlmfit <- lm(Dataset ~ 1)

I get: invalid type (list) for variable 'Dataset'. When I try

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(factor(rep(c(Dataset$day1, Dataset$day2, Dataset$day3))), ordered(Dataset$Treatment))
    Anova(mod, idata=idata, idesign=~Dataset$Treatment)

I get: Terms in the intra-subject model matrix are not orthogonal. When I try is.matrix(Dataset), I get no. My original mock Dataset (attached in txt) is below. Maybe I am not coding it right? I would hate to recode all my data for SPSS, since at the end I would need to show that sphericity was not violated.

    Subj Trtmt Sessn Response
    1    N     1     5
    1    D     1     6
    1    N     2     4
    1    D     2     7
    2    N     1     8
    2    D     1     9
    2    N     2     2
    2    D     2     1
    3    N     1     4
    3    D     1     5
    3    N     2     6
    3    D     2     2
    4    N     1     5
    4    D     1     6
    4    N     2     4
    4    D     2     7
    5    N     1     8
    5    D     1     9
    5    N     2     2
    5    D     2     1
    6    N     1     4
    6    D     1     5
    6    N     2     6
    6    D     2     2

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 5:29 PM, Mike Lawrence mike.lawre...@dal.ca wrote:

No luck as in...? What error did you encounter? In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily
[R] Quickly generate all possible combinations of 2 groups
Hi all, I suspect the answer to this query will be the tongue-in-cheek "use a quantum computer", but I thought my understanding might be sufficiently limited that I'm missing a simpler option.

I'm looking for a way to cycle through all possible combinations of 2 groups of data. For 10 data points, that's 2^10 combinations; for 20 data points it's 2^20, etc. combn() from the combinat package is useful for this purpose, as in:

    library(combinat)
    n = 20 #number of data points in the data set
    start = proc.time()[3] #start a timer
    pb = txtProgressBar(max=n+1, style=3) #initialize a progress bar
    for(i in 0:n){ #for each possible size of set1
        Set1 = combn(1:n, i) #get set1 combinations
        Set2 = rev(combn(1:n, n-i)) #get set2 combinations
        setTxtProgressBar(pb, i+1) #increment the progress bar
    }
    close(pb) #close the progress bar
    proc.time()[3] - start #show the time taken (about 40s @ 2.2GHz)

However, this obviously ends up being too slow when the number of data points rises much above 20 (I'll likely be dealing with data sets up to a maximum of 200 points).

In case it's relevant, the motivation behind this problem is that I'm seeking an alternative to EM or simplex methods for obtaining the MLE of mixture data. Given a mixture model consisting of 2 distributions, I should be able to obtain an MLE by finding the partitioning of the data into 2 groups that yields the highest likelihood. I'm specifically looking at modelling circular data by a mixture of a uniform and a von Mises centered on zero, so once I have a given partition, I can estimate the parameters of the model (proportion of points drawn from the von Mises and concentration of the von Mises) analytically and compute the likelihood of the data given that pair of parameters.
I've coded a variant of this approach that generates random partitionings of the data, and this seems to do a decent job of generating something that might be useful as a starting point for a subsequent EM or simplex search, but I thought I might double check with the list to see if there's a computationally efficient solution to the test-all-combinations scheme.

Cheers, Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
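A compact alternative to the nested combn() calls, sketched by me (not from the post): enumerate partitions as bitmasks, where each integer from 0 to 2^n - 1 encodes one assignment of the n points to the two groups. This does not change the asymptotics (intToBits caps n at 31, and exhaustive search stays hopeless anywhere near n = 200), but it avoids materializing combination matrices.

```r
# Enumerate all 2-group partitions of n points via bitmasks.
n <- 4
x <- rnorm(n)
sizes <- integer(2^n)
for (mask in 0:(2^n - 1)) {
  in1 <- as.logical(intToBits(mask))[1:n]  # bit i set -> point i in group 1
  set1 <- x[in1]
  set2 <- x[!in1]
  sizes[mask + 1] <- length(set1)
  # ... evaluate the likelihood of this partition here ...
}
table(sizes)  # group-1 sizes tally with choose(4, 0:4): 1 4 6 4 1
```

Since each partition and its complement give the same two-group split, iterating masks only up to 2^(n-1) - 1 halves the work when the two components' roles are decided by the likelihood rather than the labels.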
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Have you tried ezANOVA from the ez package? It attempts to provide a simple user interface to car's Anova (and when that fails, aov).

On Mon, Nov 9, 2009 at 1:44 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Hello everyone, I am trying to do a within-subjects repeated measures anova followed by the test of sphericity (sample dataset below). I am able to get either a mixed model or linear model anova and TukeyHSD, but have no luck with "Repeated-Measures Assuming Sphericity" or "Separate Sphericity Tests". I am trying to follow the example from the car package, but it seems that I am not getting something right.

    Dataset$Sessn <- as.factor(Dataset$Sessn)
    LinearModel.1 <- lm(Response ~ Sessn*Trtmt, data=Dataset)
    summary(LinearModel.1)

All good so far, but I have problems understanding the idata= and idesign= arguments pertaining to my example. Session is my repeated measure (Sessn 1 and Sessn 2 = two sessions; in reality I have more) and it is already stacked. Any help or guidance on this matter would be appreciated. Thank you, my mock dataset is below. Each subject has two levels of treatment throughout four calendar days, which are recoded to Session 1 and Session 2 in order to compare treatments by the first and subsequent days of exposure (Treatment x Session; my DV is Response; Session is repeated).

    Subj Trtmt Sessn Response
    1    N     1     5
    1    D     1     6
    1    N     2     4
    1    D     2     7
    2    N     1     8
    2    D     1     9
    2    N     2     2
    2    D     2     1
    3    N     1     4
    3    D     1     5
    3    N     2     6
    3    D     2     2
    4    N     1     5
    4    D     1     6
    4    N     2     4
    4    D     2     7
    5    N     1     8
    5    D     1     9
    5    N     2     2
    5    D     2     1
    6    N     1     4
    6    D     1     5
    6    N     2     6
    6    D     2     2

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA
sergioschr-at-gmail-dot-com
--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
No luck as in...? What error did you encounter? In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily holds for repeated measures with only 2 levels, and tests are really only possible for repeated measures with 3 or more levels. I think it's analogous to how you don't need to test homogeneity of variance when performing a paired t-test; the test ends up representing the pairs as a single distribution of difference scores with a single variance. Mike On Mon, Nov 9, 2009 at 5:30 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote: Tried ezANOVA, no luck with my particular dataset. Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA On Mon, Nov 9, 2009 at 2:25 PM, Mike Lawrence mike.lawre...@dal.ca wrote: Have you tried ezANOVA from the ez package? It attempts to provide a simple user interface to car's Anova (and when that fails, aov). On Mon, Nov 9, 2009 at 1:44 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote: Hello everyone, I am trying to do a within-subjects repeated measures anova followed by the test of sphericity (sample dataset below). I am able to get either a mixed model or linear model anova and TukeyHSD, but have no luck with Repeated-Measures Assuming Sphericity or Separate Sphericity Tests. I am trying to follow the example from the car package, but it seems that I am not getting something right. Dataset$Sessn <- as.factor(Dataset$Sessn) LinearModel.1 <- lm(Response ~ Sessn*Trtmt, data=Dataset) summary(LinearModel.1) All good so far, but I have trouble understanding the idata= and idesign= arguments pertaining to my example. Session is my repeated measure (Sessn 1 and Sessn 2 = two sessions; in reality I have more) and it is already stacked. Any help or guidance on this matter would be appreciated. 
Thank you, my mock dataset is below. Each subject has two levels of treatment throughout four calendar days which are recoded to Session 1 and Session 2 in order to compare treatments by the first and subsequent days of exposure (Treatment x Session; my DV is Response; Session is repeated). Subj Trtmt Sessn Response 1 N 1 5 1 D 1 6 1 N 2 4 1 D 2 7 2 N 1 8 2 D 1 9 2 N 2 2 2 D 2 1 3 N 1 4 3 D 1 5 3 N 2 6 3 D 2 2 4 N 1 5 4 D 1 6 4 N 2 4 4 D 2 7 5 N 1 8 5 D 1 9 5 N 2 2 5 D 2 1 6 N 1 4 6 D 1 5 6 N 2 6 6 D 2 2 Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA sergioschr-at-gmail-dot-com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
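The analogy to the paired t-test above can be checked directly in base R: for a two-level within-Ss factor, the F statistic from aov() with an Error() term equals the squared paired-t statistic, which is why sphericity is moot with only 2 levels. A minimal sketch (the variable names and simulated data are mine, not from the thread):

```r
# Demo: with only 2 levels of a within-Ss factor, the repeated-measures
# ANOVA F equals the squared paired-t statistic.
set.seed(1)
subj <- factor(rep(1:10, times = 2))
sess <- factor(rep(1:2, each = 10))
resp <- rnorm(20) + rep(rnorm(10), times = 2)  # add per-subject offsets
fit <- summary(aov(resp ~ sess + Error(subj/sess)))
tab <- fit[["Error: subj:sess"]][[1]]          # the within-Ss stratum table
F_val <- tab[["F value"]][1]
t_val <- t.test(resp[sess == "1"], resp[sess == "2"], paired = TRUE)$statistic
all.equal(F_val, unname(t_val)^2)  # TRUE
```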
[R] EM algorithm to fit circular mix of uniform+Von Mises
Hi all, I'm curious if anyone has coded an Expectation-Maximization algorithm that could help me model some circular data I have. I'd like to model it as a mixture of a uniform and a von Mises centered on 0, so the only free parameters are the mixing proportion and the kappa of the von Mises. I couldn't find anything in the contributed packages that seemed to suit this purpose. Any pointers would be greatly appreciated! Cheers, Mike -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
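Since nothing turned up in the contributed packages, here is a minimal EM sketch one could start from (my own code, not from any package): the von Mises mean is fixed at 0, the E-step computes each angle's responsibility under the von Mises component, and the M-step updates the mixing proportion and kappa, the latter via Fisher's approximation to the inverse of A(kappa) = I1(kappa)/I0(kappa).

```r
# EM for a mixture of circular uniform and von Mises(mu = 0, kappa),
# for angles x in (-pi, pi]. Free parameters: p (von Mises weight), kappa.
dvm0 <- function(x, kappa) exp(kappa * cos(x)) / (2 * pi * besselI(kappa, 0))
em_unif_vm <- function(x, p = 0.5, kappa = 1, tol = 1e-6, maxit = 1000) {
  for (i in seq_len(maxit)) {
    # E-step: responsibility of the von Mises component for each angle
    num <- p * dvm0(x, kappa)
    w <- num / (num + (1 - p) / (2 * pi))
    # M-step: update mixing proportion and concentration
    p_new <- mean(w)
    Rbar <- sum(w * cos(x)) / sum(w)     # weighted mean resultant length
    Rbar <- min(max(Rbar, 1e-6), 0.999)  # keep the approximation stable
    kappa_new <- Rbar * (2 - Rbar^2) / (1 - Rbar^2)  # Fisher's approximation
    done <- abs(p_new - p) < tol && abs(kappa_new - kappa) < tol
    p <- p_new; kappa <- kappa_new
    if (done) break
  }
  list(p = p, kappa = kappa)
}
```

With data simulated as half circular-uniform and half tightly concentrated around 0, the recovered mixing proportion should land near 0.5 and kappa well above 1.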
Re: [R] Creating a specific skewed distribution
With skewed unimodal distributions, the mode can't equal the mean, so assuming you want the mean to be around 30, I find that a Weibull distribution can get close to what you want: mean(rweibull(1e5,1.5,33)) [1] 29.77781 pweibull(60,1.5,33) [1] 0.9138475 I'm sure you can play with the parameters to get even closer to what you want. On Wed, Jun 10, 2009 at 2:48 AM, David Arnold dwarnol...@suddenlink.net wrote: All, Can someone help me create a skewed distribution, mean = 30, with the probability of selecting a random number from the distribution greater than or equal to 60 equal to 10%? I need the probability density function to equal zero at zero, and have a maximum height at or near 30. Is this possible? And if it is possible, how can I adjust the distribution so that the probability of selecting a random number greater than or equal to 60 is p? Thanks. No idea how to start. David [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
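Rather than hand-tuning the parameters, the two constraints can be solved directly: the Weibull mean is scale * gamma(1 + 1/shape) and the upper tail is P(X > q) = exp(-(q/scale)^shape). Fixing the tail probability expresses scale as a function of shape, leaving one equation for uniroot(). A sketch (the bracketing interval c(0.5, 5) is my assumption):

```r
# Solve for Weibull (shape, scale) with mean 30 and P(X >= 60) = 0.10.
# From P(X > q) = exp(-(q/scale)^shape) = p_tail:
#   scale = q / log(1/p_tail)^(1/shape)
# Substitute into mean = scale * gamma(1 + 1/shape) and root-find on shape.
target_mean <- 30; q <- 60; p_tail <- 0.10
scale_for <- function(shape) q / log(1 / p_tail)^(1 / shape)
gap <- function(shape) scale_for(shape) * gamma(1 + 1 / shape) - target_mean
shape <- uniroot(gap, c(0.5, 5))$root
scale <- scale_for(shape)
c(shape = shape, scale = scale)
1 - pweibull(q, shape, scale)   # tail probability: 0.10 by construction
scale * gamma(1 + 1 / shape)    # mean: ~30
```

The shape comes out a little above 1, so the density is zero at zero as requested, and changing p_tail generalizes this to an arbitrary tail probability p.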
Re: [R] graphically representing frequency of words in a speech?
Below are various attempts using ggplot2 (http://had.co.nz/ggplot2/). First I try random positioning, then random positioning with alpha, then a quasi-random position scheme in polar coordinates: #this demo has random number generation # so best to set a seed to make it # reproducible. set.seed(1) #generate some fake data a = data.frame( word = month.name , freq = sample(1:10,12,replace=TRUE) ) #add arbitrary location information a$x = sample(1:12,12) a$y = sample(1:12,12) #load ggplot2 library(ggplot2) #initialize a ggplot object my_plot = ggplot() #create an object for the text layer my_text = geom_text( data = a , aes( x = x , y = y , label = word , size = freq ) ) #create an object for the text size limits my_size_scale = scale_size( to = c(3,20) ) #create an object to expand the x-axis limits # (ensures that text isn't cropped) my_x_scale = scale_x_continuous( expand = c(.5, 0) ) #ditto for the y axis my_y_scale = scale_y_continuous( expand = c(.5, 0) ) #create an opts object that removes # plot elements unnecessary in a tag cloud my_opts = opts( legend.position = 'none' , panel.grid.minor = theme_blank() , panel.grid.major = theme_blank() , panel.background = theme_blank() , axis.line = theme_blank() , axis.text.x = theme_blank() , axis.text.y = theme_blank() , axis.ticks = theme_blank() , axis.title.x = theme_blank() , axis.title.y = theme_blank() ) #show the plot print( my_plot+ my_text+ my_size_scale+ my_x_scale+ my_y_scale+ my_opts ) #to aid readability amidst overlap, set alpha in # the call to geom_text my_text_with_alpha = geom_text( data = a , aes( x = x , y = y , label = word , size = freq ) , alpha = .5 ) #show the version with alpha print( my_plot+ my_text_with_alpha+ my_size_scale+ my_x_scale+ my_y_scale+ my_opts ) #alternatively, in polar coordinates, # which maps x to angle and y to radius, # making a nice circle print( my_plot+ my_text_with_alpha+ my_size_scale+ my_opts+ coord_polar() ) #(note omission of my_y_scale and # my_x_scale, which seem to be ignored # when coord_polar() is called. I'll # report this possible bug to the ggplot2 # maintainer) #a possible way to avoid overlap is to # map radius (y) to frequency so that # larger text is in the periphery # where there is more room. This # necessitates adding some random # noise to the frequency so that # the low frequency words don't # jumble in the center too badly a$freq2 = a$freq+rnorm(12) #now map radius (y) to freq2 my_text_with_alpha_and_freq2 = geom_text( data = a , aes( x = x , y = freq2 , label = word , size = freq ) , alpha = .5 ) #show the version with alpha and radius mapped to freq2 print( my_plot+ my_text_with_alpha_and_freq2+ my_size_scale+ my_opts+ coord_polar() ) -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting a Weibull Distribution
http://lmgtfy.com/?q=fit+weibull+distribution+R Hit #8: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf On Fri, Jun 5, 2009 at 7:21 PM, r...@rstatx.com wrote: How do you fit a Weibull distribution in R? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Anyone using Bespin R?
Just curious whether anyone has any experiences to relate with regards to using Bespin (https://bespin.mozilla.com/) as a text editor for your R scripting. Mike -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Within Subject ANOVA question
Although you factorized subject and condition when you created them as separate objects, this didn't survive cbind() then as.data.frame(), thus: example <- data.frame(cbind(subject,condition,recall)) str(example) 'data.frame': 30 obs. of 3 variables: $ subject : int 1 2 3 4 5 6 7 8 9 10 ... $ condition: int 1 1 1 1 1 1 1 1 1 1 ... $ recall : int 6 3 7 16 12 11 1 8 5 4 ... Use this alternative construction (note that I omit factorization of the response variable, recall, because this shouldn't be a factor): subject <- factor(c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)) condition <- factor(c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5)) recall <- c(10,6,11,22,16,15,1,12,9,8,13,8,14,23,18,17,1,15,12,9,13,8,14,25,20,17,4,17,12,12) example <- data.frame(subject,condition,recall) str(example) 'data.frame': 30 obs. of 3 variables: $ subject : Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ... $ condition: Factor w/ 3 levels "1","2","5": 1 1 1 1 1 1 1 1 1 1 ... $ recall : num 10 6 11 22 16 15 1 12 9 8 ... aov.recall <- aov(recall~condition + Error(subject/condition),data=example) summary(aov.recall) Error: subject Df Sum Sq Mean Sq F value Pr(>F) Residuals 9 942.53 104.73 Error: subject:condition Df Sum Sq Mean Sq F value Pr(>F) condition 2 52.267 26.133 42.506 1.519e-07 *** Residuals 18 11.067 0.615 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 On Mon, Jun 1, 2009 at 7:31 PM, tsunhin wong thjw...@gmail.com wrote: Dear R users, I have copied the following table from an article on "Using confidence intervals in within-subject designs": Subject 1sec 2sec 5sec Mean 1 10 13 13 12.00 2 6 8 8 7.33 3 11 14 14 13.00 4 22 23 25 23.33 5 16 18 20 18.00 6 15 17 17 16.33 7 1 1 4 2.00 8 12 15 17 14.67 9 9 12 12 11.00 10 8 9 12 9.67 I rearranged the data this way: subject <- factor(c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)) condition <- factor(c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5)) recall <- factor(c(10,6,11,22,16,15,1,12,9,8,13,8,14,23,18,17,1,15,12,9,13,8,14,25,20,17,4,17,12,12)) example <- cbind(subject,condition,recall) Using ANOVA (aov), I should have DF for condition = 2, DF of subjects = 9, interaction DF = 18, and a term for the mean square of the interaction (0.61). But I have something like below instead: aov.recall <- aov(recall~condition + Error(subject/condition),data=as.data.frame(example)) summary(aov.recall) Error: subject Df Sum Sq Mean Sq F value Pr(>F) Residuals 1 12.898 12.898 Error: subject:condition Df Sum Sq Mean Sq condition 1 34.505 34.505 Error: Within Df Sum Sq Mean Sq F value Pr(>F) condition 1 3.22 3.22 0.1378 0.7135 Residuals 26 607.54 23.37 The within-subject (repeated measures) anova of R seems to be a bit subtle for me; please point out the errors that I have made if you can. Thank you very much! - John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. 
~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2 legend
, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 15L, 16L, 17L, 16L, 17L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 4L, 5L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 5L, 6L, 7L, 8L, 8L, 9L, 8L, 8L, 9L, 9L, 8L, 9L, 10L, 11L, 10L, 10L, 11L, 12L, 11L, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 4L, 5L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 5L, 6L, 7L, 8L, 8L, 9L, 8L, 8L, 9L, 9L, 8L, 9L, 10L, 11L, 10L, 10L, 11L, 12L, 11L, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 15L, 16L, 17L, 16L, 17L, 16L, 17L, 18L, 19L, 17L, 18L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 4L, 5L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 5L, 6L, 7L, 8L, 8L, 9L, 8L, 8L, 9L, 9L, 8L, 9L, 10L, 11L, 10L, 10L, 11L, 12L, 11L, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 15L, 16L, 17L, 16L, 17L, 16L, 17L, 18L, 19L, 17L, 18L)), .Names = c(SampleDate, PondName, Muestreo, BodyWeight.g., Length.mm.), class = data.frame, row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 
215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428)) library(ggplot2) fishplot - qplot(PondName,BodyWeight.g.,data=fish_ByMuestreo,colour=Muestreo,position=jitter) + stat_summary(aes(group=Muestreo),fun.data=mean_cl_normal,colour=green,geom=smooth,fill=NA) + opts(title=Average weight(grs) by Pond) print(fishplot) Felipe D. Carrillo Supervisory Fishery Biologist Department of the Interior US Fish Wildlife Service California, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. 
~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple ANOVA tests
#create some data y=rnorm(20) x=factor(rep(c('A','B'),each=10)) #run the anova my_aov = aov(y~x) #summarize the anova my_aov_summary = summary(my_aov) #show the anova summary print(my_aov_summary) #let's see what's in the summary object str(my_aov_summary) #looks like it's a list with 1 element which #in turn is a data frame with columns. #The Pr(>F) column looks like what we want my_aov_summary[[1]]$P #yup, that's it. Grab the first value p = my_aov_summary[[1]]$P[1] On Wed, May 27, 2009 at 7:11 AM, Imri bisr...@agri.huji.ac.il wrote: Hi all - I'm trying to do multiple one-way ANOVA tests of different factors on the same variable. As a result I have a list with all the ANOVA tables, for example: $X11_20502 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 369.9 123.3 6.475 0.0002547 *** Residuals 635 12093.2 19.0 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 $X11_21067 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 1 26.7 26.7 1.3662 0.2429 Residuals 637 12436.4 19.5 $X11_10419 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 527.8 175.9 9.361 4.621e-06 *** Residuals 635 11935.3 18.8 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 My question is: how can I extract from this list just the Pr(>F) values for each x? -- View this message in context: http://www.nabble.com/Multiple-ANOVA-tests-tp23739615p23739615.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. 
~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple ANOVA tests
you could use ldply from the plyr package: p = ldply(q,function(x){x$P}) Without your data I can't confirm that works, but something like that should do it. On Wed, May 27, 2009 at 9:23 AM, Imri bisr...@agri.huji.ac.il wrote: Thanks for the answer!!! I know how to extract the Pr(>F) value from a single ANOVA table, but I have a list of many ANOVA tables received by: a <- function(x)(aov(MPH~x)) q <- apply(assoc[,18:20],2,a) # just for example, I have more than 3 factors (x) print(q) $X11_20502 Df Sum Sq Mean Sq F value Pr(>F) x 3 369.9 123.3 6.475 0.0002547 *** Residuals 635 12093.2 19.0 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 246 observations deleted due to missingness $X11_21067 Df Sum Sq Mean Sq F value Pr(>F) x 1 26.7 26.7 1.3662 0.2429 Residuals 637 12436.4 19.5 246 observations deleted due to missingness $X11_10419 Df Sum Sq Mean Sq F value Pr(>F) x 3 527.8 175.9 9.361 4.621e-06 *** Residuals 635 11935.3 18.8 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 246 observations deleted due to missingness summary(q) Length Class Mode X11_20502 1 summary.aov list X11_21067 1 summary.aov list X11_10419 1 summary.aov list How can I extract all the Pr(>F) values from q (not one by one)? Thanks Imri Mike Lawrence wrote: #create some data y=rnorm(20) x=factor(rep(c('A','B'),each=10)) #run the anova my_aov = aov(y~x) #summarize the anova my_aov_summary = summary(my_aov) #show the anova summary print(my_aov_summary) #let's see what's in the summary object str(my_aov_summary) #looks like it's a list with 1 element which #in turn is a data frame with columns. #The Pr(>F) column looks like what we want my_aov_summary[[1]]$P #yup, that's it. Grab the first value p = my_aov_summary[[1]]$P[1] On Wed, May 27, 2009 at 7:11 AM, Imri bisr...@agri.huji.ac.il wrote: Hi all - I'm trying to do multiple one-way ANOVA tests of different factors on the same variable. As a result I have a list with all the ANOVA tables, for example: $X11_20502 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 369.9 123.3 6.475 0.0002547 *** Residuals 635 12093.2 19.0 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 $X11_21067 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 1 26.7 26.7 1.3662 0.2429 Residuals 637 12436.4 19.5 $X11_10419 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 527.8 175.9 9.361 4.621e-06 *** Residuals 635 11935.3 18.8 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 My question is: how can I extract from this list just the Pr(>F) values for each x? -- View this message in context: http://www.nabble.com/Multiple-ANOVA-tests-tp23739615p23739615.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Multiple-ANOVA-tests-tp23739615p23741437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
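For completeness, the same extraction can be done in base R without plyr by sapply()-ing over the list of summaries. A sketch with simulated data standing in for Imri's assoc object (the factor names here are mine):

```r
# Extract the first Pr(>F) from each of several one-way ANOVA summaries.
set.seed(1)
MPH <- rnorm(60)
factors <- data.frame(f1 = gl(3, 20), f2 = gl(2, 30), f3 = gl(4, 15))
q <- lapply(factors, function(x) summary(aov(MPH ~ x)))
p_values <- sapply(q, function(s) s[[1]][["Pr(>F)"]][1])
p_values  # named vector, one p-value per factor
```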
Re: [R] Labeling barplot bars by multiple factors
You can get something close with ggplot2: library(ggplot2) my_data = expand.grid( A = factor(c('A1','A2')) , B = factor(c('B1','B2')) , C = factor(c('C1','C2')) ) my_data$DV = rnorm(8,mean=10,sd=1) p = ggplot() p = p + layer( geom = 'bar' , stat = 'identity' , data = my_data , mapping = aes( x = C , y = DV , fill = B ) , position = 'dodge' ) p = p + facet_grid( A ~ . ) p = p + coord_flip() print(p) On Wed, May 27, 2009 at 1:01 PM, Thomas Levine thomas.lev...@gmail.com wrote: I want to plot quantitative data as a function of three two-level factors. How do I group the bars on a barplot by level through labeling and spacing? Here's what I'm thinking of: http://www.thomaslevine.org/sample_multiple-factor_barplot.png Also, I'm pretty sure that I want a barplot, but there may be something better. Tom [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help wth boxplots
check out ggplot2: http://had.co.nz/ggplot2/ particularly: http://had.co.nz/ggplot2/geom_boxplot.html On Tue, May 26, 2009 at 10:34 AM, Amit Patel amitrh...@yahoo.co.uk wrote: Hi, I have a vector of data, let's call it zz (40 values from 4 samples). The data is already in groups; I can even split up the samples using SampA <- zz[,2:11] SampB <- zz[,12:21] SampC <- zz[,22:31] SampV <- zz[,32:41] I would like an output that gives me 4 boxplots on one plot, one boxplot for each set of 10 values. How can I do this in R? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
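The base-R route works here too, for comparison: stack the four column blocks into one long vector with a grouping variable and hand the formula to boxplot(). A sketch assuming zz is a data frame laid out as described (the simulated zz and group labels are mine):

```r
# Four boxplots on one plot, one per sample of 10 columns.
set.seed(1)
zz <- as.data.frame(matrix(rnorm(41 * 5), ncol = 41))  # stand-in data
vals <- c(unlist(zz[, 2:11]), unlist(zz[, 12:21]),
          unlist(zz[, 22:31]), unlist(zz[, 32:41]))
grp <- rep(c("A", "B", "C", "V"), each = 10 * nrow(zz))
boxplot(vals ~ grp, ylab = "value")
```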
Re: [R] Help me...!!!
http://www.r-project.org/posting-guide.html Basic statistics and classroom homework: R-help is not intended for these. On Tue, May 26, 2009 at 11:37 AM, abel1682 lizard_1...@yahoo.it wrote: Hi to all... I'm a new R user and I have to solve some exercises, so I'm asking you for help... 1.) How can I demonstrate in R that the limit for x --> infinity of (1+1/x)^x is equal to e? 2.) If I have a vector of values, how can I create a function that, applied to my vector, gives me the median, mean, Var and length together? 3.) Find the minimum of this function: f(x)=(x-3)^4 with the Newton method. 4.) Define a function that is able to calculate the geometric mean of a series. Sorry for all these questions... Thanks a lot!!!... -- View this message in context: http://www.nabble.com/Help-me...%21%21%21-tp23724167p23724167.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to create all pairs
expand.grid(i,j) On Mon, May 25, 2009 at 8:26 PM, alad abhimanyu...@gmail.com wrote: Hi, I have: i = c(1,2,3) j = c(4,5,6) How do I create a matrix of all pairs? i.e. 1,4 1,5 1,6 2,4 : Thanks! -- View this message in context: http://www.nabble.com/How-to-create-all-pairs-tp23714659p23714659.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
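One caveat worth noting about that one-liner (the ordering concern is my addition): expand.grid() varies its first argument fastest, so expand.grid(i, j) yields the pairs as 1,4 2,4 3,4 ... To reproduce the 1,4 1,5 1,6 ordering shown in the question, put j first and reorder the columns:

```r
i <- c(1, 2, 3)
j <- c(4, 5, 6)
# expand.grid(j, i) varies j fastest; reorder columns back to (i, j)
m <- as.matrix(expand.grid(j = j, i = i)[, c("i", "j")])
m  # rows: 1,4  1,5  1,6  2,4  ...
```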
Re: [R] help with replacing factors
This should work: levels(black_gray$gray)[levels(black_gray$gray)=='gray20'] = 'blue' On Sun, May 24, 2009 at 8:15 AM, Andreas Christoffersen achristoffer...@gmail.com wrote: Hi, In the example dataset below - how can I change "gray20" to "blue"? # data black <- rep(c("black","red"),10) gray <- rep(c("gray10","gray20"),10) black_gray <- data.frame(black,gray) # none of these desperate things works # replace(black_gray$gray, gray=="gray20","red") # if(black_gray$gray=="gray20"){black_gray$gray <- "blue"} # for (i in black_gray$gray) if(black_gray$gray[i]=="gray20"){black_gray$gray[i] <- "blue"} # black_gray$gray=="gray14" <- "blue" # black_gray$gray[gray=="gray20"] <- "blue" # subset(black_gray,gray=="gray20",gray) <- rep("blue",10) I have a feeling this is me misunderstanding some very basic stuff about the R engine... So any help will be much appreciated. Thanks in advance Andreas [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
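A quick self-contained check of that levels()<- recipe (one note: on R >= 4.0, data.frame() no longer converts strings to factors by default, so the factor() calls below are explicit):

```r
black_gray <- data.frame(
  black = factor(rep(c("black", "red"), 10)),
  gray  = factor(rep(c("gray10", "gray20"), 10))
)
# recode the level itself; every row carrying "gray20" becomes "blue"
levels(black_gray$gray)[levels(black_gray$gray) == "gray20"] <- "blue"
table(black_gray$gray)
```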
Re: [R] Coord_equal in ggplot2
If you use coord_equal on data where the range of the x-axis is larger than the range of the y-axis, then of course you'll observe extra space on the y-axis. What did you expect? Also, this post may be better suited to the ggplot2 mailing list: http://had.co.nz/ggplot2/

On Tue, May 19, 2009 at 7:17 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:
Dear all, I'm plotting some points on a graph where both axes need to have the same scale. See the example below. coord_equal does the trick but in this case it wastes a lot of space on the y-axis. Setting the limits of the y-axis myself was of no avail. Any suggestions to solve this problem?
library(ggplot2)
ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000, min = 14, max = 26))
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() + scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits = c(14, 26))
Regards, Thierry
ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
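A base-graphics analogue of the point being made (this sketch is an addition, not from the thread): asp = 1 forces the same unit length on both axes, just as coord_equal() does, so an x-range of about 30 plotted against a y-range of about 12 necessarily leaves vertical padding.

```r
# Same toy data as the question; with a 1:1 aspect ratio the wider
# x-range dictates the plot shape, producing the "wasted" y space.
ds <- data.frame(x = runif(1000, min = 0, max = 30),
                 y = runif(1000, min = 14, max = 26))
plot(ds$x, ds$y, asp = 1)
```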
Re: [R] MAC OSX vs Win XP: Different stats test results!
If mulrank does any sort of random number generation or non-exhaustive randomization, you should set the seed of the random number generator first:

set.seed(1)
mulrank(3, 6, data03$x)

On Mon, May 18, 2009 at 7:37 AM, Mareen mareenwe...@yahoo.com wrote:
Hi all, I wondered whether anyone has some advice on a stats-related 'sanity check', as I ran a nonparametric multivariate test (mulrank function as described by R. Wilcox, 2005) on both systems, but got different results (please see below for the system-specific outputs)! The functions I used are attached as well. Any advice would be much appreciated! Thanks in advance for getting back to me! Best wishes, Mareen

Mac:
data03 <- selby2(data02, c(1,2), 3)
mulrank(3,6,data03$x)
$test.stat [1] 0.9331133
$nu1 [1] 11.46300
$p.value [,1] [1,] 0.509296
$N [1] 233
$q.hat [,1] [,2] [,3] [,4] [,5] [,6] [1,] 0.4940071 0.5256726 0.5176384 0.5476290 0.4690935 0.5265100 [2,] 0.5170627 0.4791950 0.5026431 0.4867843 0.4778865 0.5033497 [3,] 0.4680729 0.4944258 0.4889563 0.4505391 0.5311420 0.4726002

Win:
mulrank(3,6, data03$x)
$test.stat [1] 1.114665
$nu1 [1] 8.155991
$p.value [,1] [1,] 0.3491221
$N [1] 233
$q.hat [,1] [,2] [,3] [,4] [,5] [,6] [1,] 0.4940071 0.5406417 0.5236837 0.5656338 0.4771799 0.5324505 [2,] 0.5162776 0.4801895 0.5022244 0.4960745 0.4854234 0.4820737 [3,] 0.5013608 0.4920967 0.4810269 0.4482885 0.5326861 0.4871506

http://www.nabble.com/file/p23595008/Rallfun-v92.txt Rallfun-v92.txt
-- View this message in context: http://www.nabble.com/MAC-OSX-vs-Win-XP%3A-Different-stats-test-results%21-tp23595008p23595008.html Sent from the R help mailing list archive at Nabble.com.
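An illustration of the seeding point (toy code, not the mulrank function itself): two runs of a randomized computation agree only when the RNG is seeded identically, which is the first thing to check when two machines give different "random" answers.

```r
# Identical seeds give identical streams of random numbers.
set.seed(1); a <- rnorm(5)
set.seed(1); b <- rnorm(5)
# Without reseeding, a third draw would differ from both.
c_draw <- rnorm(5)
```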
Re: [R] sample variance from simulation
why not simply

vars = list()
for (i in 1:1000) vars[[i]] = var(z[[i]])

On Mon, May 18, 2009 at 6:51 AM, Kon Knafelman konk2...@hotmail.com wrote:
Hi,
g=list()
for(i in 1:1000){z[[i]]=rnorm(15,0,1)}
I've attempted a similar problem based on the above method. Now, if i want to find the sample variance, do i go about it like this?
for (i in 1:1000) vars[[i]] = sum(z[[i]])
vars[[i]]
the overall sigma squared will just be 1, because the distribution is standard normal. Is this correct? if so, then to find (n-1)S^2/σ^2, i will need s = 999*sum(vars[[i]])/1? Is this correct, or am i getting lost along the way? Thank you

Date: Wed, 13 May 2009 16:45:22 +0100 From: b.rowling...@lancaster.ac.uk To: csa...@rmki.kfki.hu CC: r-help@r-project.org Subject: Re: [R] Simulation
On Wed, May 13, 2009 at 4:26 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:
On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
Dear R users, Can anyone please tell me how to generate a large number of samples in R, given certain distribution and size. For example, if I want to generate 1000 samples of size n=100, with a N(0,1) distribution, how should I proceed? (Since I don't want to do rnorm(100,0,1) in R for 1000 times)
Why not? It took 0.05 seconds on my 5 years old laptop.
Second-guessing the user, I think she maybe doesn't want to type in 'rnorm(100,0,1)' 1000 times... Soln - for loop:
z=list()
for(i in 1:1000){z[[i]]=rnorm(100,0,1)}
now inspect the individual bits:
hist(z[[1]])
hist(z[[545]])
If that's the problem, then I suggest she reads an introduction to R... Barry
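A self-contained sketch of the suggestion (data simulated here so it runs standalone): collect the per-sample variance of each of the 1000 samples, via the loop suggested above or, equivalently, via sapply().

```r
# 1000 samples of size 15 from N(0,1).
set.seed(42)
z <- list()
for (i in 1:1000) z[[i]] <- rnorm(15, 0, 1)

# Loop version of the per-sample variances...
vars <- list()
for (i in 1:1000) vars[[i]] <- var(z[[i]])

# ...and the equivalent one-liner.
vars2 <- sapply(z, var)
```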
Re: [R] sample variance from simulation
Ah, I thought this smelled like homework... Please read the R-help mailing list posting guide (http://www.r-project.org/posting-guide.html), specifically: "Basic statistics and classroom homework: R-help is not intended for these."

On Mon, May 18, 2009 at 10:35 AM, Kon Knafelman konk2...@hotmail.com wrote:
Hey, when i type in either of those formulas into R, i don't really get the answer i'm looking for. For such large samples, isn't the sample variance meant to approach the actual variance, which is 1 for a standard normal? also, when i use sapply, i get 1000 results for variance, where i think i just need one number. I've worked on this problem for so long. The initial problem is as follows:
Use the simulation capacity of R to generate m = 1000 samples of size n = 15 from a N(0,1) distribution. Compute the statistic (n-1)S^2/σ^2 for the normally generated values, labelling as NC14. Produce a probability histogram for NC14 and superimpose the theoretical distribution for a χ2 (14 degrees of freedom).
g=list()
for(i in 1:1000){z[[i]]=rnorm(15,0,1)}
for (i in 1:1000) vars[[i]] = sum(z[[i]])
vars[[i]]
sum(var(z[[i]])) [1] 0.9983413
Does this make sense? my logic is that i use the loop again to add up all the individual variances. i'm not really sure if i did it correctly, but if someone could make the necessary corrections, i'd be very very grateful. Thanks heaps guys for taking the time to look at this

Date: Mon, 18 May 2009 15:06:47 +0200 From: waclaw.marcin.kusnierc...@idi.ntnu.no To: konk2...@hotmail.com CC: mike.lawre...@dal.ca; r-help@r-project.org Subject: Re: [R] sample variance from simulation
Mike Lawrence wrote:
why not simply vars=list() for (i in 1:1000) vars[[i]] = var(z[[i]])
... or, much simpler,
vars = sapply(z, var)
vQ
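For the record, a hedged sketch of the statistic the assignment describes (my own toy code, not a homework answer): with σ^2 = 1, the quantity (n-1)S^2/σ^2 computed per sample should follow a chi-squared distribution on n-1 = 14 degrees of freedom, which the histogram overlay makes visible.

```r
# 1000 samples of size 15; (n-1)*S^2 with sigma^2 = 1 is just 14*var.
set.seed(1)
z <- lapply(1:1000, function(i) rnorm(15))
nc14 <- 14 * sapply(z, var)
# Probability histogram with the theoretical chi-squared(14) curve.
hist(nc14, freq = FALSE, breaks = 30)
curve(dchisq(x, df = 14), add = TRUE)
```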
Re: [R] MAC OSX vs Win XP: Different stats test results!
A good example of why manipulating data in external programs is dangerous. See a discussion of this at http://tolstoy.newcastle.edu.au/R/e6/help/09/05/13697.html

On Mon, May 18, 2009 at 10:56 AM, Mareen Weber mareenwe...@yahoo.com wrote:
Hello Peter, thanks for your quick response - you have hinted in the right direction. I triple-checked my data and - how embarrassing - I actually did have different data matrices. Due to a faulty Win Excel macro that sorted data into groups after data collection, I ended up with different data in my 3 groups in the Win test... All good now :-). Thanks again! Best wishes, Mareen
Re: [R] read multiple large files into one dataframe
What types of data are in each file? All numbers, or a mix of numbers and characters? Any missing data or special NA values? On Wed, May 13, 2009 at 7:45 AM, SYKES, Jennifer jennifer.sy...@nats.co.uk wrote: Hello Apologies if this is a simple question, I have searched the help and have not managed to work out a solution. Does anybody know an efficient method for reading many text files of the same format into one table/dataframe? I have around 90 files that contain continuous data over 3 months but that are split into individual days data and I need the whole 3 months in one file for analysis. Each days file contains a large amount of data (approx 30MB each) and so I need a memory efficient method to merge all of the files into the one dataframe object. From what I have read I will probably want to avoid using for loops etc? All files are in the same directory, none have a header row, and each contain around 180,000 rows and the same 25 columns/variables. Any suggested packages/routines would be very useful. Thanks Jennifer - ***If you are not the intended recipient, please notify our Help Desk at Email postmas...@nats.co.uk immediately. You should not copy or use this email or attachment(s) for any purpose nor disclose their contents to any other person. NATS computer systems may be monitored and communications carried on them recorded, to secure the effective operation of the system and for other lawful purposes. Please note that neither NATS nor the sender accepts any responsibility for viruses or any losses caused as a result of viruses and it is your responsibility to scan or otherwise check this email and any attachments. NATS means NATS (En Route) plc (company number: 4129273), NATS (Services) Ltd (company number 4129270), NATSNAV Ltd (company number: 4164590) or NATS Ltd (company number 3155567) or NATS Holdings Ltd (company number 4138218). 
All companies are registered in England and their registered office is at 5th Floor, Brettenham House South, Lancaster Place, London, WC2E 7EN. **
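One common pattern for this task (a sketch on three tiny throw-away files, not the poster's 30MB data; file names and sizes here are invented): read every file in the directory with lapply(), then bind once with do.call(rbind, ...), which avoids repeatedly growing a data frame. Pre-declaring colClasses also spares read.table the cost of guessing column types.

```r
# Write three toy "daily" files into a temp directory.
dir <- tempdir()
for (d in 1:3)
  write.table(matrix(rnorm(10 * 4), 10, 4),
              file.path(dir, sprintf("day%02d.txt", d)),
              row.names = FALSE, col.names = FALSE)

# Read them all and bind into one data frame in a single step.
files <- list.files(dir, pattern = "^day", full.names = TRUE)
all_days <- do.call(rbind,
                    lapply(files, read.table, header = FALSE,
                           colClasses = rep("numeric", 4)))
```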
Re: [R] Simulation
If you want k samples of size n, why not generate all k*n values at once and put them in a k-by-n matrix, where you can do what you want to each sample:

k = 10
n = 100
x = matrix(rnorm(k*n), k, n)
rowMeans(x)

If you need to do more complex things to each sample, and if k is large enough that you don't want the matrix sitting around in memory while you do these things, you could also check out ?replicate .

On Wed, May 13, 2009 at 12:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
Dear R users, Can anyone please tell me how to generate a large number of samples in R, given certain distribution and size. For example, if I want to generate 1000 samples of size n=100, with a N(0,1) distribution, how should I proceed? (Since I don't want to do rnorm(100,0,1) in R for 1000 times) Thanks for help Debbie
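Spelling out the matrix suggestion at the poster's scale (an illustrative sketch): with one row per sample, per-sample summaries reduce to row-wise operations, and apply() covers anything rowMeans() doesn't.

```r
# 1000 samples of size 100, one per row.
set.seed(1)
k <- 1000; n <- 100
x <- matrix(rnorm(k * n), k, n)
means <- rowMeans(x)       # fast per-sample means
vars  <- apply(x, 1, var)  # per-sample variances, or any other summary
```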
Re: [R] Kumaraswamy distribution
dkumar = function(x,a,b) a*b*(x^(a-1))*((1-x^a)^(b-1))
pkumar = function(x,a,b) 1-(1-x^a)^b

(I've based this entirely on the Wikipedia entry on the Kumaraswamy distribution [http://en.wikipedia.org/wiki/Kumaraswamy_distribution], so best to check both my replication of the formula there and the accuracy of the Wikipedia entry itself.)

On Tue, May 12, 2009 at 6:37 AM, Debbie Zhang debbie0...@hotmail.com wrote:
Dear R users, Does anyone know how to write functions for the Kumaraswamy distribution in R? Since I cannot write dkumar, pkumar, etc. in R. Please help. Thanks a lot, Debbie
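A quick numerical self-check of those two functions (my addition, following the caveat above): the integral of the density from 0 to q should equal the CDF at q.

```r
# Same formulas as in the post.
dkumar <- function(x, a, b) a * b * (x^(a - 1)) * ((1 - x^a)^(b - 1))
pkumar <- function(x, a, b) 1 - (1 - x^a)^b

# Integrate the density numerically and compare with the CDF.
p_int <- integrate(dkumar, lower = 0, upper = 0.7, a = 2, b = 3)$value
p_cdf <- pkumar(0.7, 2, 3)
```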
Re: [R] Two-way Anova
Using traditional ANOVA, you'd have to drop either cases or time points with missing data. Using linear mixed effects analysis, you'd be able to use all the data. LME also has the benefit of *not* assuming sphericity, which is good for data like yours (many measures across few cases) where the traditional ANOVA sphericity assumption is unlikely to hold.

Your dependent variable, % viability, suggests that there's some more raw representation of the data that may be better to look at. For example, if % viability is derived from, say, the success/failure rate of 10 observations per sample/timepoint, you might want to take a look at the lme4 package (as suggested in a previous post: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q3/001160.html )

On Tue, May 12, 2009 at 6:03 AM, Alan O'Loughlin olo...@wyeth.com wrote:
Hello, I'm trying to do a comparison on a large scale, say a 10L bottle of liquid, and a small scale bottle of liquid, 0.5L. I have 5 different samples from each and they are measured over the space of 8 days as % viability, and the % viability decreases over time. However not all 10 samples got measured every day. How would I do a two-way anova on this in R? Thanks for any help. Regards, Al
Re: [R] Do you use R for data manipulation?
I work in cognitive science, where we collect one or more data files per participant in an experiment then merge those files to perform subsequent analyses. Sometimes some files are in wide format and others are in long format, necessitating reshaping. I've found R entirely satisfactory for this.* Indeed, I would be wary of an approach that attempts data manipulation *outside* of R, as I'm of the "raw data in, results out" school of thought: it's dangerous to isolate your data manipulation from your record of analysis. If you leave your raw data files untouched and perform all manipulation and analysis in one system (R), then there is a complete record of what's happened to the data from start to finish and it's easier to catch/correct errors.

The reshape package is great for reshaping between long and wide data formats, and the plyr package is great for computing summary statistics within cells of the design.

Mike

*note: I typically use Python for data collection (showing visual stimuli, recording responses, etc.), but have it spit out raw text files of the trial-by-trial data, and thus use it for only a bare minimum of processing.
Re: [R] General Question about Graphics
Check out ggplot2: http://had.co.nz/ggplot2/
Particularly the book: http://had.co.nz/ggplot2/book
For error bars on bar plots specifically: http://had.co.nz/ggplot2/geom_errorbar.html
Mike

On Wed, May 6, 2009 at 12:34 PM, steve_fried...@nps.gov wrote:
Greetings, I have encountered a situation with regards to plotting barcharts with associated error bars. My search for clues on how to accomplish this turned up some interesting information. Basically, I found that including error bars with barplots is considered not desirable, and hence it appears that there is no function to do this. Can someone offer suggestions on how to do this simple procedure or offer an alternative? Thanks Steve
Steve Friedman Ph. D. Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 steve_fried...@nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147
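There is also a base-graphics route (a sketch with made-up means and standard errors, since the post includes no data): barplot() returns the bar midpoints, which arrows() with 90-degree heads can turn into error bars.

```r
# Hypothetical group means and standard errors.
means <- c(3, 5, 4)
ses   <- c(0.4, 0.6, 0.5)

# barplot() returns the x positions of the bar centres.
mids <- barplot(means, names.arg = c("A", "B", "C"),
                ylim = c(0, max(means + ses) * 1.1))
# Flat-headed arrows from mean-SE to mean+SE make the error bars.
arrows(mids, means - ses, mids, means + ses,
       angle = 90, code = 3, length = 0.05)
```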
Re: [R] Bumps chart in R
(cross posting to the ggplot2 group for posterity) Here's how I'd approach it:

library(ggplot2)
text = letters[1:20]
tal1 = rnorm(20,5,2)
tal2 = rnorm(20,6,3)
dif = tal2-tal1
df0 = data.frame(text,tal1,tal2)
df = melt(
    data = df0
    , id.vars = 'text'
    , variable_name = 'tal'
)
df$dif = dif
df$col = ifelse(dif>0,'red',ifelse(dif<0,'blue','black'))
df$size = abs(dif)
# draw the plot
ggplot(
    data=df
    , aes(
        x=tal
        , y=value
        , group=text
    )
)+
geom_line(aes(size=size, colour=col))+
geom_text(aes(label=text))+
opts(legend.position='none')+
theme_bw()

Unfortunately it's not perfect: (1) col isn't being interpreted literally, so the lines aren't truly red and blue. (2) the line ends are square whereas usually they're rounded. (3) attempting to remove the legend via opts(legend.position='none') seems to fail.

On Wed, May 6, 2009 at 6:44 PM, Andreas Christoffersen achristoffer...@gmail.com wrote:
Okay - first off: Thank you all for your kind help so far. (Especially to Hadley for writing ggplot2.) I feel that I am beginning to understand the grammar of graphics - but obviously still have a long way to go. Some of the road I need to travel has to do with basic R. Anyway; instead of writing a new post - I thought it best to post in the current topic. Please correct me if I am wrong. Getting to the point: I have expanded upon the original bumps chart (or whatever type of chart it is). I'd like the line width to be proportional to the change between time 1 and time 2. Also: I'd like colours to be red if the change is negative, and black if the change is positive.
My solution produces some problems for me:

library(ggplot2) # Loads ggplot2
text <- letters[1:20] # letters is proxy for categories
tal1 <- rnorm(20,5,2) # random numbers for first
tal2 <- rnorm(20,6,3) # random numbers for second
dif <- tal2-tal1 # difference between second and first
df0 <- cbind(tal1,tal2,dif) # do dataframe
df <- melt(df0) # melt
farve <- c(2,1,1,2,2,1,2,2,2,2,1,1,1,2,2,1,1,1,2,2)
# define colours - black for positive change, red for negative change
# these colours I handcode - depending on the random generated - so it will not fit new data.
# draw the plot
qplot(X2, value, data=df, group=X1, geom="blank")+
geom_line(aes(group=X1), subset(df,X2!="dif"), size=scale(abs(subset(df,df$X2=="dif")$value),center=F,scale=T)[,1], colour=farve)+
geom_text(aes(label=X1), subset(df,X2=="tal2"), size=3, hjust=-3, vjust=1)+theme_bw()
# My questions:
# How to do colours automatically
# how to remove dif from the X axis? - subset doesn't seem to work? - e.g.
# qplot(subset(df,X2!="dif",X2, drop=T), value, data=df) - returns an error.
# How to use melt better - so that text becomes the id? id="text" doesn't work.

thanks in advance

On Tue, Apr 28, 2009 at 12:09 AM, Andreas Christoffersen achristoffer...@gmail.com wrote:
My legend is removed! - Couldn't find it in your ggplot2 book - but here it is. Brilliant - thank you very much.
Re: [R] plot estimates and their 95% confidence intervals
Take a look at ggplot2:
http://had.co.nz/ggplot2/
http://had.co.nz/ggplot2/book/
Particularly:
http://had.co.nz/ggplot2/geom_point.html
http://had.co.nz/ggplot2/geom_errorbar.html
http://had.co.nz/ggplot2/coord_flip.html

ex:
p = ggplot(my_data, aes(x=state, y=ami_mean))
p = p + geom_point()
p = p + geom_errorbar(aes(ymin=ami_low, ymax=ami_up))
p = p + coord_flip()
print(p)

On Mon, Apr 27, 2009 at 6:00 PM, He, Yulei h...@hcp.med.harvard.edu wrote:
Hi, there: I have a dataset with 50 states and for each state, I have its associated mean estimate (for some parameters) and the lower and upper bound of the 95% CI. The data look like below:
  state ami_mean  ami_low   ami_up
1    MS -0.58630 -0.90720 -0.29580
2    KY -0.48100 -0.75990 -0.19470
3    FL -0.47900 -0.62930 -0.32130
I would like to have a plot of the 95% CI (characterized by the mean, lower, and upper bound, and the lines linking them) for each state, with the x-axis the parameter estimate and the y-axis the state. What kind of plot functions can I use? Thanks. Yulei
Re: [R] Bumps chart in R
Here's a ggplot2 based solution:

#load the ggplot2 library
library(ggplot2)

#here's the data provided by Andreas
countries <- c("U-lande", "Afrika syd for sahara", "Europa og Centralasien", "Lantinamerika og Caribien", "Mellemøsten og Nordafrika", "Sydasien", "Østasien og stillehavet", "Kina", "Brasilien")
poor_1990 <- c(28.7,46.7,0.5,10.2,2.3,43,29.8,33,14)
poor_2004 <- c(18.1,41.1,0.9,8.6,1.5,30.8,9.1,9.9,7.5)

#reformat the data
data = data.frame(countries, poor_1990, poor_2004)
data = melt(data, id=c('countries'), variable_name='year')
levels(data$year) = c('1990','2004')

#make a new column to make the text justification easier
data$hjust = 1-(as.numeric(data$year)-1)

#start the percentage plot
p = ggplot(
    data
    , aes(
        x=year
        , y=value
        , groups=countries
    )
)

#add the axis labels
p = p + labs(x = '\nYear', y = '%\n')

#add lines
p = p + geom_line()

#add the text
p = p + geom_text(aes(label=countries, hjust=hjust))

#expand the axis to fit the text
p = p + scale_x_discrete(expand=c(2,2))

#show the plot
print(p)

#rank the countries
data$rank = NA
data$rank[data$year=='1990'] = rank(data$value[data$year=='1990'])
data$rank[data$year=='2004'] = rank(data$value[data$year=='2004'])

#start the rank plot
r = ggplot(
    data
    , aes(
        x=year
        , y=rank
        , groups=countries
    )
)

#add the axis labels
r = r + labs(x = '\nYear', y = 'Rank\n')

#add the lines
r = r + geom_line()

#add the text
r = r + geom_text(aes(label=countries, hjust=hjust))

#expand the axis to fit the text
r = r + scale_x_discrete(expand=c(2,2))

#show the plot
print(r)
Re: [R] boxplot of two variables
Check out ggplot2: http://had.co.nz/ggplot2
especially: http://had.co.nz/ggplot2/geom_boxplot.html
But you are strongly advised to read the book: http://had.co.nz/ggplot2/book/

On Thu, Apr 23, 2009 at 12:13 PM, Gabriel R. Rodriguez garo...@cnia.inta.gov.ar wrote:
Hello! I have a dataframe with 6 variables (A1,A2,B1,B2,C1,C2) and 1 factor (F). I would like to produce a graph consisting of 3 boxplot sets, one for every two variables (i.e. A1 A2) by the factor (F). I was looking around and I cannot figure it out, any suggestions? Best Regards, Gabriel
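A base-graphics sketch with made-up data shaped like the question (two of the six columns shown; the others would work the same way): reshape the paired columns to long format and let boxplot()'s formula interface do the grouping by variable and factor.

```r
# Hypothetical data: two paired variables plus a grouping factor F.
df <- data.frame(A1 = rnorm(40), A2 = rnorm(40),
                 F  = rep(c("ctrl", "trt"), each = 20))

# Stack A1 and A2 into one value column with a variable label.
long <- data.frame(value = c(df$A1, df$A2),
                   var   = rep(c("A1", "A2"), each = nrow(df)),
                   F     = rep(df$F, 2))

# One box per var-by-F combination.
boxplot(value ~ var + F, data = long)
```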
Re: [R] Sampling in R
Perm.Cor.Sorted <- sort(Perm.Cor, decreasing=TRUE)
if(mat.cor < 0) Perm.Cor.Sorted <- sort(Perm.Cor, decreasing=FALSE)
T95 <- Perm.Cor.Sorted[(nperm+1)*0.05] # 95% threshold value
T99 <- Perm.Cor.Sorted[(nperm+1)*0.01] # 99% threshold value

I want to understand where I am making a mistake. Any comment is deeply appreciated. Kind Regards Seyit Ali
-- Dr. Seyit Ali KAYIS Selcuk University Faculty of Agriculture Kampus, Konya, TURKEY s_a_ka...@yahoo.com, s_a_ka...@hotmail.com Tel: +90 332 223 2830 Mobile: +90 535 587 1139 Fax: +90 332 241 0108 Greetings from Konya, TURKEY http://www.ziraat.selcuk.edu.tr/skayis/
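A hedged restatement of the thresholding step (toy permutation statistics, not the original correlation code): quantile() reads the 95% and 99% cutoffs straight off the permutation distribution, with no manual sorting or indexing to go off by one.

```r
# Null distribution of a correlation under independence,
# via 999 simulated "permutations".
set.seed(1)
perm.cor <- replicate(999, cor(rnorm(20), rnorm(20)))

# Upper-tail thresholds, read off directly.
T95 <- quantile(perm.cor, 0.95)
T99 <- quantile(perm.cor, 0.99)
```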
[R] OT: Strange bootstrap approach to significance testing
Hi all, I'm reviewing a paper that employs a strange (to me) approach to non-parametric significance testing. I'm familiar with permutation tests and with bootstrapping confidence intervals around parameter estimates, but here they seem to be bootstrapping CIs for a manufactured null F-value. They have a simple 3-group between-Ss design and compute the F-value of the observed data. Then they re-center each group's mean to zero and repeatedly: randomly resample values from each group (with replacement, doubling each group's size) and re-compute the F-value. The distribution of re-centered/resampled F-values is then used as a reference distribution within which the percentile of the observed F-value is computed. They determine that the observed F-value is outside the 95% confidence interval of the re-centered/resampled F-values and conclude a significant effect of group. In my head this makes some sense (though I'd think a straight permutation test would be a lot simpler), but having never heard of anything like this before, I thought I'd see what others on this list think of the approach. Mike
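For concreteness, here is a sketch in R of the procedure as described (fake data; my own reading of their method, not an endorsement of it):

```r
set.seed(1)
g <- gl(3, 20)                            # 3 groups, n = 20 each
y <- rnorm(60) + as.numeric(g) * 0.5      # invented data with a group effect

obs_F <- anova(lm(y ~ g))[["F value"]][1] # observed F

y0 <- y - ave(y, g)                       # re-center each group's mean to zero
null_F <- replicate(999, {
  # resample each re-centered group with replacement, doubling its size
  yb <- unlist(lapply(split(y0, g), function(v)
    sample(v, 2 * length(v), replace = TRUE)))
  gb <- gl(3, 2 * 20)
  anova(lm(yb ~ gb))[["F value"]][1]
})

mean(null_F >= obs_F)                     # percentile-style "p-value"
```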
Re: [R] Paired test for 2-way ANOVA
Paired (aka repeated-measures, aka within-Ss) tests can be achieved by using aov() and specifying the within-Ss effect in the error term:

my_aov <- aov(dependent_variable ~ between_Ss_variable*within_Ss_variable +
  Error(Ss_id/(within_Ss_variable)))
summary(my_aov)

On Mon, Apr 20, 2009 at 10:25 AM, A Ezhil ezhi...@yahoo.com wrote: Hi, I have experimental data with 2 factors: visit and treatment. Each subject has 2 visits, and on each visit they get a treatment. We take samples before and after treatment. I can easily do the 2-way ANOVA in R with visit and treatment as factors:

anova( lm(data ~ visit*treatment) )

If I want to do the same 2-way ANOVA for the paired samples, how can I do that? Is there an R package for that? Thanks in advance. Kind regards, Ezhil
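Applied to Ezhil's visit-by-treatment design, the template above might look like this (all names and data are invented for illustration):

```r
set.seed(1)
d <- expand.grid(subject   = factor(1:10),
                 visit     = factor(1:2),
                 treatment = factor(c("pre", "post")))
d$response <- rnorm(nrow(d))

# both factors vary within subject, so both go in the Error() term
fit <- aov(response ~ visit * treatment + Error(subject/(visit * treatment)),
           data = d)
summary(fit)
```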
Re: [R] Welcome to the R-help mailing list
When plotting in loops with lattice, you need to wrap your plot call in print().

On Sat, Apr 18, 2009 at 2:14 PM, Qifei Zhu zhu_qi...@yahoo.com.sg wrote: Hi all, I'm a newbie R developer trying to dotplot a few graphs using a for loop. The following code works fine, but once I want to plot inside a loop, nothing happens:

y <- c(1,2,3)
x <- c('a','b','c')
dotplot(y~x)                      # works
for (i in 1:3) { dotplot(y~x) }   # nothing happens

(y and x depend on i in the actual case.) I'd appreciate your advice on what is going wrong. Thanks. Best, Tony
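The fix in full (lattice is assumed here, since dotplot() usually comes from that package; at top level R auto-prints the returned trellis object, but inside a loop nothing triggers printing):

```r
library(lattice)

y <- c(1, 2, 3)
x <- factor(c("a", "b", "c"))

for (i in 1:3) {
  print(dotplot(y ~ x))   # print() forces the trellis object to render
}
```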
Re: [R] performing function on data frame
As Michael notes, scale() is what you want. Also, your request has an incorrect definition of z-scores:

d$y - mean(d$y)/sd(d$y)       # incorrect
(d$y - mean(d$y)) / sd(d$y)   # correct

On Thu, Apr 16, 2009 at 7:40 AM, Michael Conklin michael.conk...@markettools.com wrote:

newDF <- as.data.frame(scale(oldDF))

see ?scale. Hope that helps. Michael Conklin

-----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Karin Lagesen Sent: Thursday, April 16, 2009 5:29 AM To: r-help@r-project.org Subject: Re: [R] performing function on data frame

David Hajage dhajag...@gmail.com writes: "Hi Karin, I'm not sure I understand... Is this what you want? d$y - mean(d$y)/sd(d$y)"

Yes, and also a bit no. Each column in my data frame represents one data set. For every element in this data set I want to know the z value for that element. I.e., I want to create a new data frame from the old data frame, where each element in the new data frame is

newDF[i,j] = oldDF[i,j] - mean(d[,j]) / sddev(d[,j])

I could, I think, iterate like this over the data frame, but I keep thinking that one of the apply functions should be employed... Karin

--
Karin Lagesen, Ph.D. karin.lage...@medisin.uio.no http://folk.uio.no/karinlag
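For Karin's apply-style instinct, here are scale() and an explicit column-wise version side by side (data invented for illustration; both give per-column z-scores):

```r
oldDF <- data.frame(a = rnorm(10), b = rnorm(10))

newDF <- as.data.frame(scale(oldDF))    # z-scores, all columns at once

# the same thing spelled out per column, as Karin describes
newDF2 <- as.data.frame(lapply(oldDF, function(y) (y - mean(y)) / sd(y)))
```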
Re: [R] F test
summary(my_lm) will give you t-values, anova(my_lm) will give you (equivalent) F-values. summary() might be preferred because it also provides the estimates' SEs.

a = data.frame(dv=rnorm(10), iv1=rnorm(10), iv2=rnorm(10))
my_lm = lm(dv ~ iv1*iv2, a)
summary(my_lm)

Call:
lm(formula = dv ~ iv1 * iv2, data = a)

Residuals:
     Min       1Q   Median       3Q      Max
 -1.8484  -0.2059   0.1627   0.4623   1.0401

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.4864     0.4007  -1.214    0.270
iv1           0.8233     0.5538   1.487    0.188
iv2           0.2314     0.3863   0.599    0.571
iv1:iv2      -0.4110     0.5713  -0.719    0.499

Residual standard error: 1.017 on 6 degrees of freedom
Multiple R-squared: 0.3161, Adjusted R-squared: -0.02592
F-statistic: 0.9242 on 3 and 6 DF, p-value: 0.4842

anova(my_lm)
Analysis of Variance Table

Response: dv
          Df Sum Sq Mean Sq F value Pr(>F)
iv1        1 1.9149  1.9149  1.8530 0.2223
iv2        1 0.4156  0.4156  0.4021 0.5494
iv1:iv2    1 0.5348  0.5348  0.5175 0.4990
Residuals  6 6.2004  1.0334

On Thu, Apr 16, 2009 at 10:35 AM, kayj kjaj...@yahoo.com wrote: Hi, how can I find the p-value for the F test for the interaction terms in a linear regression model (lm)? I appreciate your help.
Re: [R] F test
I'm new to LME myself, so it would be best for others to advise on this.

On Thu, Apr 16, 2009 at 3:00 PM, Jun Shen jun.shen...@gmail.com wrote: Mike, I kind of have the same question. For a mixed-effect model, say using lme(), how do you specify an interaction effect between a fixed effect and a random effect, and where do you find the result of the interaction? Thanks. Jun

On Thu, Apr 16, 2009 at 12:08 PM, Mike Lawrence mike.lawre...@dal.ca wrote: summary(my_lm) will give you t-values, anova(my_lm) will give you (equivalent) F-values. summary() might be preferred because it also provides the estimates' SEs.

--
Jun Shen PhD, PK/PD Scientist, BioPharma Services, Millipore Corporation
15 Research Park Dr., St Charles, MO 63304. Direct: 636-720-1589
Re: [R] F test
"Ahem. Equivalent, my tired foot..." My bad, I wasn't paying attention. "May I suggest consulting a textbook *before* flunking ANOVA 101?" Harsh but warranted given my carelessness.

On Thu, Apr 16, 2009 at 3:47 PM, Emmanuel Charpentier charp...@bacbuc.dyndns.org wrote: On Thursday 16 April 2009 at 14:08 -0300, Mike Lawrence wrote:

summary(my_lm) will give you t-values, anova(my_lm) will give you (equivalent) F-values.

Ahem. Equivalent, my tired foot... In simple terms (the real story may be more intricate): the F values stated by anova() are something entirely different from the t values in summary(). The latter allow you to assess properties of *one* coefficient in your model (namely, do I have enough support to state that it is nonzero?). The former allow you to assess whether you have support for stating that *ALL* the coefficients related to the same factor cannot be *SIMULTANEOUSLY* null. Which is a horse of quite another color...

By the way: while your summary indeed gives you an unbiased estimate of your coefficient and a (hopefully) unbiased estimate of its standard error, the F ratio is the ratio of estimates of remaining variabilities with and without the H0 assumption it tests, that is, that *ALL* coefficients of your factor of interest are *SIMULTANEOUSLY* null. F and t numbers will be equivalent if and only if your factor of interest needs only one coefficient to be expressed, i.e. is a continuous covariate or a two-class discrete variable (such as a boolean). In this case, you can test your factor either by the t value, which under H0 fluctuates as a Student's t with n_res dof (n_res being the residual degrees of freedom of the model), or by the F value, which will fluctuate as a Fisher F statistic with 1 and n_res dof, which happens (but that's not happenstance...) to be the *square* of a t with n_res dof.

May I suggest consulting a textbook *before* flunking ANOVA 101?

Emmanuel Charpentier
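Emmanuel's single-coefficient case is easy to verify numerically: for a one-df term, the F reported by anova() is exactly the square of the t reported by summary() (illustrative data):

```r
set.seed(1)
d <- data.frame(y = rnorm(20), x = rnorm(20))   # x: continuous covariate (1 df)
fit <- lm(y ~ x, data = d)

t_val <- summary(fit)$coefficients["x", "t value"]
F_val <- anova(fit)["x", "F value"]
all.equal(t_val^2, F_val)   # TRUE: F = t^2 for a single-df term
```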
Re: [R] How to apply a function to all combinations of the values of 2 or more variables?
Check out plyr: http://had.co.nz/plyr/

On Wed, Apr 15, 2009 at 2:03 PM, Lane, Jim jim.l...@rbc.com wrote: Hi, all. Forgive me if this is a stupid newbie question. I'm having no luck googling an answer to this, probably because I don't know the right R terminology to frame my question. I want to know how to run an R function on each combination of the values of 2 or more variables. In SAS-speak this is multiple BY variables or CLASS variables. In R I've figured out how to do what I want with one by value:

by(mf, mf$centre, summary)

Centre is one of the columns in the data frame mf, which looks like this:

names(mf)
[1] "centre" "complex" "appl" "pool" "month" "alloc_gb"

I'd like to analyze, for example, by complex within centre. This has a manageable number of combinations:

table(mf$centre, mf$complex)
  A B C D E F G H I J K L
B 0 0 0 0 0 0 0 0 0 0 0 60574 0
G 44 209 0 94613 0 156 0 2541 0 748 0 0 215511
O 0 0 0 0 0 0 3446 0 676 0 84400 0 0
Q 0 0 298 0 66277 0 0 0 0 0 0 0 0

In SAS what I'm after is something like:

Proc summary nway;
  class centre complex;
  var alloc_gb;
  output out=s sum=;
run;

How do I get something similar in R?

Jim Lane, Capacity Planner, RBC Financial Group
315 Front St W, 6th Floor - H14, Toronto, Ontario, CANADA M5V 3A4. 416-348-6024

"Display of superior knowledge is as great a vulgarity as display of superior wealth - greater indeed, inasmuch as knowledge should tend more definitely than wealth towards discretion and good manners." - H. W. Fowler, Modern English Usage

This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.
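Besides plyr, base R can reproduce PROC SUMMARY's centre-by-complex sums directly (the mf here is invented; only the column names come from Jim's post):

```r
set.seed(1)
mf <- data.frame(centre   = sample(c("B", "G", "O", "Q"), 50, replace = TRUE),
                 complex  = sample(LETTERS[1:4], 50, replace = TRUE),
                 alloc_gb = runif(50, 1, 100))

# long form: one row per observed centre x complex combination
aggregate(alloc_gb ~ centre + complex, data = mf, FUN = sum)

# or as a cross-tabulated matrix
tapply(mf$alloc_gb, list(mf$centre, mf$complex), sum)
```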
Re: [R] cbind
a=c(NA,NA,2,3,NA,NA,NA) which.min(is.na(a)) [1] 3 b=1:length(a) b-3 [1] -2 -1 0 1 2 3 4 cbind(b=b-3,a=a) b a [1,] -2 NA [2,] -1 NA [3,] 0 2 [4,] 1 3 [5,] 2 NA [6,] 3 NA [7,] 4 NA On Tue, Apr 14, 2009 at 7:20 AM, emj83 stp08...@shef.ac.uk wrote: I have a list of numbers with NAs as below: A[,1] [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [19] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [37] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [55] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [73] NA NA NA 62 78 98 73 57 63 56 88 77 151 165 129 78 83 63 [91] 72 68 61 89 95 74 53 77 90 106 114 113 84 59 60 77 46 91 [109] 108 111 76 75 70 61 65 61 52 94 71 67 52 86 79 97 80 101 [127] 87 53 85 79 86 104 153 128 155 148 NA NA NA NA NA NA NA NA [145] NA NA NA NA NA NA NA I would like to bind a column to this list which begins at 0 when the first number has occured, but provides negative numbers prior to this as below: [1,] -75 NA [2,] -74 NA [3,] -73 NA [4,] -72 NA [5,] -71 NA [6,] -70 NA [7,] -69 NA [8,] -68 NA [9,] -67 NA [10,] -66 NA [11,] -65 NA [12,] -64 NA [13,] -63 NA [14,] -62 NA [15,] -61 NA [16,] -60 NA [17,] -59 NA [18,] -58 NA [19,] -57 NA [20,] -56 NA [21,] -55 NA [22,] -54 NA [23,] -53 NA [24,] -52 NA [25,] -51 NA [26,] -50 NA [27,] -49 NA [28,] -48 NA [29,] -47 NA [30,] -46 NA [31,] -45 NA [32,] -44 NA [33,] -43 NA [34,] -42 NA [35,] -41 NA [36,] -40 NA [37,] -39 NA [38,] -38 NA [39,] -37 NA [40,] -36 NA [41,] -35 NA [42,] -34 NA [43,] -33 NA [44,] -32 NA [45,] -31 NA [46,] -30 NA [47,] -29 NA [48,] -28 NA [49,] -27 NA [50,] -26 NA [51,] -25 NA [52,] -24 NA [53,] -23 NA [54,] -22 NA [55,] -21 NA [56,] -20 NA [57,] -19 NA [58,] -18 NA [59,] -17 NA [60,] -16 NA [61,] -15 NA [62,] -14 NA [63,] -13 NA [64,] -12 NA [65,] -11 NA [66,] -10 NA [67,] -9 NA [68,] -8 NA [69,] -7 NA [70,] -6 NA [71,] -5 NA [72,] -4 NA [73,] -3 NA [74,] -2 NA [75,] -1 NA [76,] 0 62 [77,] 1 78 [78,] 2 98 [79,] 3 73 [80,] 4 57 [81,] 5 63 [82,] 6 56 [83,] 
7 88 [84,] 8 77 [85,] 9 151 [86,] 10 165 [87,] 11 129 [88,] 12 78 [89,] 13 83 [90,] 14 63 [91,] 15 72 [92,] 16 68 [93,] 17 61 [94,] 18 89 [95,] 19 95 [96,] 20 74 [97,] 21 53 [98,] 22 77 [99,] 23 90 [100,] 24 106 [101,] 25 114 [102,] 26 113 [103,] 27 84 [104,] 28 59 [105,] 29 60 [106,] 30 77 [107,] 31 46 [108,] 32 91 [109,] 33 108 [110,] 34 111 [111,] 35 76 [112,] 36 75 [113,] 37 70 [114,] 38 61 [115,] 39 65 [116,] 40 61 [117,] 41 52 [118,] 42 94 [119,] 43 71 [120,] 44 67 [121,] 45 52 [122,] 46 86 [123,] 47 79 [124,] 48 97 [125,] 49 80 [126,] 50 101 [127,] 51 87 [128,] 52 53 [129,] 53 85 [130,] 54 79 [131,] 55 86 [132,] 56 104 [133,] 57 153 [134,] 58 128 [135,] 59 155 [136,] 60 148 [137,] 61 NA [138,] 62 NA [139,] 63 NA [140,] 64 NA [141,] 65 NA [142,] 66 NA [143,] 67 NA [144,] 68 NA [145,] 69 NA [146,] 70 NA [147,] 71 NA [148,] 72 NA [149,] 73 NA [150,] 74 NA [151,] 75 NA

Could anyone help me with a function that would be able to calculate the sequence I require to bind to the initial sequence? Thanks, Emma
Re: [R] cbind
oops, of course I meant:

a=c(NA,NA,2,3,NA,NA,NA)
b=1:length(a)
cbind(b=b-which.min(is.na(a)),a=a)
Re: [R] cbind
Unfortunately Dimitris' solution fails in the face of NA padding on both sides of the numeric data, as in Emma's original example:

x <- c(rep(NA, 20), sample(100, 25), rep(NA, 20))

On Tue, Apr 14, 2009 at 9:40 AM, Dimitris Rizopoulos d.rizopou...@erasmusmc.nl wrote: try this:

x <- c(rep(NA, 20), sample(100, 25))
n.na <- sum(is.na(x))
cbind(seq(-n.na, length(x) - n.na - 1), x)

I hope it helps. Best, Dimitris

--
Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478  Fax: +31/(0)10/7043014
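A variant that survives NA padding on both sides is to anchor the counter at the first non-NA position, as in the earlier which.min(is.na()) suggestion:

```r
x <- c(rep(NA, 20), sample(100, 25), rep(NA, 20))

first <- which.min(is.na(x))             # index of the first non-NA value
cbind(index = seq_along(x) - first, x)   # index is 0 at the first observed value
```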
Re: [R] Group by in R
One way:

g <- paste(f1, f2, f3, f4)
table(g)

On Mon, Apr 13, 2009 at 7:33 AM, Nick Angelou nikola...@yahoo.com wrote: Hi, I have the following table data: f1, f2, f3, f4. I want to compute the counts of unique combinations of f1-f4. In SQL I would just write:

SELECT COUNT(*) FROM table GROUP BY f1, f2, ..., f4

How to do this in R? Thanks, Nick
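The counts can also be kept as a data frame, which is closer to the SQL GROUP BY output (a sketch with two invented factors instead of Nick's four):

```r
set.seed(1)
f1 <- sample(c("a", "b"), 20, replace = TRUE)
f2 <- sample(c("x", "y"), 20, replace = TRUE)

counts <- as.data.frame(table(f1 = f1, f2 = f2))
counts[counts$Freq > 0, ]   # one row per observed combination, with its count
```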
Re: [R] weighted mean and by() with two index
Sounds like a job for plyr: http://had.co.nz/plyr

On Mon, Apr 13, 2009 at 7:56 PM, Dong H. Oh r.arec...@gmail.com wrote: Hi expeRts, I would like to calculate a weighted mean by two factors. My code is as follows:

tmp <- by(re$meta.sales.lkm[, c("pc", "sales")],
          re$meta.sales.lkm[, c("size", "yr")],
          function(x) weighted.mean(x[,1], x[,2]))

The result is as follows:

tmp
size: micro   yr: 1994  [1] 1.090
size: small   yr: 1994  [1] 1.135
size: medium  yr: 1994  [1] 1.113
size: large   yr: 1994  [1] 1.105
size: micro   yr: 1995  [1] 1.167
size: small   yr: 1995  [1] 1.096
size: medium  yr: 1995  [1] 1.056

But the form I want to get is as follows:

        1994  1995  1996 ...
micro   1.090 1.167 ...
small   1.135 1.096
medium  1.113 1.056
large   1.105 ...

That is, the result should be tabularized. How can I get the above form directly? (I don't want to modify tmp with as.vector() and matrix() to get the result.) Thank you in advance.

--
Donghyun Oh, CESIS, KTH
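Without plyr, tapply() over row indices gives the tabular layout directly (the column names follow the question; the data frame is invented):

```r
set.seed(1)
d <- data.frame(pc    = runif(40, 1, 1.2),
                sales = runif(40),
                size  = sample(c("micro", "small", "medium"), 40, replace = TRUE),
                yr    = sample(1994:1995, 40, replace = TRUE))

# weighted mean of pc (weights = sales) for each size x yr cell,
# returned as a size-by-year matrix
tapply(seq_len(nrow(d)),
       list(size = d$size, yr = d$yr),
       function(i) weighted.mean(d$pc[i], d$sales[i]))
```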
Re: [R] taking the log then later exponentiate the result query
Your problem is that with the alpha and beta you've specified,

(((alpha-1)^(alpha-1))*((beta-1)^(beta-1)))/((beta+alpha-2)^(alpha+beta-2))

is Inf/Inf, which is NaN.

On Sun, Apr 12, 2009 at 5:39 PM, Mary Winter statsstud...@hotmail.com wrote: Hi, I am trying to figure out the observed acceptance rate and M, using generalised rejection sampling to generate a sample from the posterior distribution for p. I have been told my code doesn't work because I need to take the log of the expression for M, evaluate it and then exponentiate the result. This is because R is unable to calculate high powers such as 545.501. As you can see in my code I have tried taking the log of M and then the exponential of the result, but I clearly must be doing something wrong. I keep getting the error message: Error in if (U <= ratio/exp(M)) { : missing value where TRUE/FALSE needed. Any ideas how I go about correctly taking the log and then the exponential?

rvonmises.norm <- function(n, alpha, beta) {
    out <- rep(0, n)
    counter <- 0
    total.sim <- 0
    p <- alpha/(alpha+beta)
    M <- log((((alpha-1)^(alpha-1))*((beta-1)^(beta-1)))/((beta+alpha-2)^(alpha+beta-2)))
    while(counter < n) {
        total.sim <- total.sim + 1
        proposal <- runif(1)
        if(proposal >= 0 && proposal <= 1) {
            U <- runif(1)
            ratio <- (p^(alpha-1))*((1-p)^(beta-1))
            if(U <= ratio/exp(M)) {
                counter <- counter + 1
                out[counter] <- proposal
            }
        }
    }
    obs.acc.rate <- n/total.sim
    return(list(out = out, obs.acc.rate = obs.acc.rate, M = M))
}
set.seed(220)
temp <- rvonmises.norm(1, 439.544, 545.501)
print(temp$obs.acc.rate)

Louisa
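For the log/exp step itself, the trick is never to form the huge powers at all: expand the log of the product into a sum of logs. A sketch using the posted alpha and beta (the direct computation gives Inf/Inf, hence NaN):

```r
alpha <- 439.544
beta  <- 545.501
# direct computation overflows: (alpha-1)^(alpha-1) is Inf, so M would be NaN
logM <- (alpha - 1) * log(alpha - 1) +
        (beta  - 1) * log(beta  - 1) -
        (alpha + beta - 2) * log(alpha + beta - 2)
# then compare on the log scale: accept when log(U) <= log(ratio) - logM
```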
Re: [R] Quantative procedure assessing if data is normal
www.rseek.org "normality test"

On Sun, Apr 12, 2009 at 8:45 PM, Henry Cooper henry.1...@hotmail.co.uk wrote: Hi, As part of an R code assignment I have been asked to find a quantitative procedure for assessing whether or not the data are normal. I have previously used the graphical procedure via the qqnorm command. Any help/tips would be greatly appreciated as to how I should start going about this! Henry
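Besides searching rseek, one standard quantitative procedure that ships with base R is the Shapiro-Wilk test; a minimal sketch:

```r
set.seed(42)
x <- rnorm(100)
shapiro.test(x)  # a small p-value is evidence against normality
```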
Re: [R] Genstat into R - Randomisation test
Looks like that code implements a non-exhaustive variant of the randomization test, sometimes called a permutation test. Assuming you have a data frame with columns ID and CD, this should do it:

n.obs = 100
ID = rnorm(n.obs)
CD = rnorm(n.obs)
obs.cor = cor(ID, CD)
num.permutations = 1e4
perm.cor = rep(NA, num.permutations)
start.time = proc.time()[1]
index = 1:n.obs
for(i in 1:num.permutations){
    IDorder = sample(index)
    CDorder = sample(index)
    perm.cor[i] = .Internal(cor(ID[IDorder], CD[CDorder], 4, FALSE))
}
cat('Elapsed time:', start.time - proc.time()[1])
sum(perm.cor > obs.cor)/num.permutations

On Wed, Apr 8, 2009 at 10:02 AM, Anne Kempel kem...@ips.unibe.ch wrote: Hello everybody, I have a question. I would like to get a correlation between constitutive and induced plant defence, which I measured on 30 plant species. So I have a table with Species, Induced defence (ID), and constitutive defence (CD). Since induced and constitutive defence are not independent (a so-called spurious correlation) I should do a randomisation test. I have a syntax from my supervisor in Genstat, but I would really like to try this in R.

data from trade-off.IDCD
list
variate [nval=1000] slope
calc ID1=ID
graph ID; CD
calc b=corr(ID; CD)
calc slope$[1]=b
slope$[1] is the correlation before permutating the data
for i=2...1000
randomize ID1
calc b=corr(CD1; ID1)
calc slope$[i]=b
endfor
hist slope
describe slope
quantile [proportion=!(0.0005,0.005, 0.025, 0.975, 0.995,0.9995)] slope
print slope$[1]
corr [p=corr] ID,CD
DHISTOGRAM [WINDOW=1; ORIENTATION=vertical; KEY=0; BARCOVERING=1.0] slope; PEN=2

Has anybody done something similar and has any idea how to make the randomisation part? I would be very grateful for any help!!
Thanks in advance, Anne -- Anne Kempel Institute of Plant Sciences University of Bern Altenbergrain 21 CH-3103 Bern kem...@ips.unibe.ch
Re: [R] Genstat into R - Randomisation test
"Assuming you have a data frame with columns ID and CD, this should do it" Oops, the code I sent doesn't assume this. It assumes that you have two vectors, ID and CD, as generated in the first 3 lines.

On Wed, Apr 8, 2009 at 11:31 AM, Mike Lawrence mike.lawre...@dal.ca wrote: Looks like that code implements a non-exhaustive variant of the randomization test, sometimes called a permutation test. [snip]
Re: [R] Genstat into R - Randomisation test
*Hits head* Of course, the approach taken by your Genstat code of only shuffling one variable is sufficient and faster:

n.obs = 100
ID = rnorm(n.obs)
CD = rnorm(n.obs)
obs.cor = cor(ID, CD)
num.permutations = 1e4
perm.cor = rep(NA, num.permutations)
start.time = proc.time()[1]
index = 1:n.obs
for(i in 1:num.permutations){
    IDorder = sample(index)
    perm.cor[i] = .Internal(cor(ID[IDorder], CD, 4, FALSE))
}
cat('Elapsed time:', start.time - proc.time()[1])
sum(perm.cor > obs.cor)/num.permutations

On Wed, Apr 8, 2009 at 11:33 AM, Mike Lawrence mike.lawre...@dal.ca wrote: "Assuming you have a data frame with columns ID and CD, this should do it" Oops, the code I sent doesn't assume this. It assumes that you have two vectors, ID and CD, as generated in the first 3 lines. [snip]
Re: [R] Genstat into R - Randomisation test
*sigh* and the elapsed time line should be:

cat('Elapsed time:', proc.time()[1] - start.time)

On Wed, Apr 8, 2009 at 11:35 AM, Mike Lawrence mike.lawre...@dal.ca wrote: *Hits head* Of course, the approach taken by your Genstat code of only shuffling one variable is sufficient and faster: [snip]
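Pulling the thread's corrections together, the whole test can be written more compactly with replicate(), and with cor() in place of the .Internal() call (calling .Internal() directly is generally discouraged in user code); a sketch on simulated vectors:

```r
set.seed(1)
n.obs <- 30
ID <- rnorm(n.obs)
CD <- rnorm(n.obs)
obs.cor <- cor(ID, CD)
# shuffle one variable only; that already randomizes the pairing
perm.cor <- replicate(1e4, cor(sample(ID), CD))
mean(perm.cor >= obs.cor)  # one-sided permutation p-value
```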
Re: [R] Genstat into R - Randomisation test
I was taught that Fisher proposed the F-test as a computationally simpler approximation to what he called a Randomization test, consisting of exhaustive permutations. I never looked at the original Fisher reference myself, so this may be false. However, I haven't observed a consistent nomenclature when I have seen these tests discussed, so I typically make sure to mention whether what I'm doing is exhaustive or non-exhaustive. I do see the value in your interpretation, and think it makes sense to drop "randomization" as a name (despite its possible historical significance) and start using "exhaustive permutation test" (to contrast with "non-exhaustive permutation test").

On Wed, Apr 8, 2009 at 1:18 PM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote: Mike Lawrence wrote: "Looks like that code implements a non-exhaustive variant of the randomization test, sometimes called a permutation test." Isn't it the other way around? (Permutation tests can be exhaustive by looking at all permutations; if a randomization test did that, then it wouldn't be random.) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
Re: [R] threshold distribution
You didn't properly specify the x-axis coordinates in your call to lines():

plot(density(x))
lines(x, dlnorm(x, -.1348, .1911), col='red')
points(x, dlnorm(x, -.1348, .1911), col='blue')
curve(dlnorm(x, -.1348, .1911), col='green', add=T)

On Sun, Apr 5, 2009 at 6:40 AM, Bernardo Rangel Tura t...@centroin.com.br wrote: On Sun, 2009-04-05 at 01:13 -0400, jim holtman wrote: Here is what I get from using 'fitdistr' in R to fit to a lognormal. The resulting density plot from the distribution seems to be a reasonable match to the data.

x <- scan()
1: 0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559
9: 0.85009 0.85804 0.73324 1.04826 0.84002
14: 1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549
22: 0.93641 0.80418 0.95285 0.76876 0.82588
27: 1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696
35: 0.82364 0.84390 0.71402 0.80293 1.02873
40:
Read 39 items

plot(density(x))
library(MASS)
fitdistr(x, 'lognormal')
   meanlog      sdlog
-0.13480636  0.19118861
(0.03061468) (0.02164785)
lines(dlnorm(x, -.1348, .1911), col='red')

Hi Jim, I agree with your solution but my plot result is not fine. I obtain the same result:

fitdistr(x, 'lognormal')
   meanlog      sdlog
-0.13480636  0.19118861
(0.03061468) (0.02164785)

In the plot, when I use points (blue) and curve (green) the fit of the lognormal and density(data) is fine, but when I use lines (red) the fit is bad (in the attachment I send a PDF of my output). Do you know why this happens? Thanks in advance -- Bernardo Rangel Tura, M.D, MPH, Ph.D National Institute of Cardiology Brazil P.S.
my script is:

x <- scan()
0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559
0.85009 0.85804 0.73324 1.04826 0.84002
1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549
0.93641 0.80418 0.95285 0.76876 0.82588
1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696
0.82364 0.84390 0.71402 0.80293 1.02873

require(MASS)
fitdistr(x, 'lognormal')
pdf("adj.pdf")
plot(density(x))
lines(dlnorm(x, -.1348, .1911), col='red')
points(x, dlnorm(x, -.1348, .1911), col='blue')
curve(dlnorm(x, -.1348, .1911), col='green', add=T)
dev.off()
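One further wrinkle worth noting: lines() connects points in the order given, so if x is unsorted the red curve will zig-zag even once the x coordinates are supplied. Sorting first gives a clean overlay; a sketch using the fitted meanlog/sdlog from the thread:

```r
xs <- sort(x)
plot(density(x))
lines(xs, dlnorm(xs, -0.1348, 0.1911), col = 'red')
```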
Re: [R] repeated measures ANOVA - among group differences
If Month is nested within Quadrat I think you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/Month), RM)

If Treatment is also nested within Quadrat, you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/(Treatment*Month)), RM)

On Wed, Apr 1, 2009 at 12:42 AM, Jessica L Hite/hitejl/O/VCU hit...@vcu.edu wrote: I have data on the proportion of clutches experiencing different fates (e.g., 4 different sources of mortality) for 5 months. I need to test 1) if the overall proportion of these different fates is different over the entire study and 2) to see if there are monthly differences within (and among) fate types. Thus, I am pretty sure this is an RM analysis (I measure the same quadrats each month). I am fine running the analysis in R with the code below; however, there is no output for the among-group variation... this is an important component - any ideas on how to solve this problem? I have included code and sample data below. Many thanks in advance for help and suggestions. J

both.aov <- aov(ProportioninTreatment ~ factor(Treatment)*factor(Month) + Error(factor(Quadrat)), RM)

Error: factor(id)
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  3 0.51619 0.17206                 #why only partial output here?
###

Error: Within
                    Df Sum Sq Mean Sq F value   Pr(>F)
factor(Fate1)        3 1.2453  0.4151  3.5899 0.017907 *
time                 1 0.9324  0.9324  8.0637 0.005929 **
factor(Fate1):time   3 0.9978  0.3326  2.8763 0.042272 *
Residuals           69 7.9783  0.1156

Fate1 Proportion in Fate ASIN Month Quadrat 1 0.117647059 0.350105778 1 1 1 0 0 2 1 1 0.1 0.339836909 3 1 1 0 0 4 1 1 0 0 5 1 1 0 0 1 2 1 0 0 2 2 1 0.2 0.463647609 3 2 1 0.25 0.523598776 4 2 1 0.1 0.339836909 5 2 1 0 0 1 3 1 0 0 2 3 1 0 0 3 3 1 0.384615385 0.668964075 4 3 1 0 0 5 3 1 0 0 1 4 1 0 0 2 4 1 0 0 3 4 1 0.16667 0.420534336 4 4 1 0 0 5 4 2 0.352941176 0.636132062 1 1 2 0.2 0.463647609 2 1 2 0.3 0.615479708 3 1 2 1 1.570796327 4 1 2 0 0 5 1 2 0.5 0.785398163 1 2 2 0 0 2 2 2 0.6 0.886077124 3 2 2 0.41667 0.701674124 4 2 2 0.2 0.490882678 5 2 2 0 0 1 3 2 0.2 0.463647609 2 3 2 0 0 3 3 2 0.461538462 0.746898594 4 3 2 0 0 5 3 2 0 0 1 4 2 0 0 2 4 2 0.307692308 0.588002604 3 4 2 0.7 0.955316618 4 4 2 0 0 5 4 3 0 0 1 1 3 0 0 2 1 3 0.4 0.729727656 3 1 3 0 0 4 1 3 1 1.570796327 5 1 3 0.5 0.785398163 1 2 3 0 0 2 2 3 0 0 3 2 3 0.25 0.523598776 4 2 3 0.6 0.841068671 5 2 3 0 0 1 3 3 0 0 2 3 3 0 0 3 3 3 0.153846154 0.403057075 4 3 3 0.7 0.955316618 5 3 3 0 0 1 4 3 0 0 2 4 3 0 0 3 4 3 0 0 4 4 3 0.875 1.209429203 5 4 4 0.294117647 0.573203309 1 1 4 0.2 0.463647609 2 1 4 0 0 3 1 4 0 0 4 1 4 0 0 5 1 4 0 0 1 2 4 0 0 2 2 4 0 0 3 2 4 0.08333 0.292842771 4 2 4 0.1 0.339836909 5 2 4 0 0 1 3 4 0 0 2 3 4 0 0 3 3 4 0 0 4 3 4 0.16667 0.420534336 5 3 4 0 0 1 4 4 0 0 2 4 4 0.461538462 0.746898594 3 4 4 0 0 4 4 4 0.125 0.361367124 5 4
Re: [R] repeated measures ANOVA - among group differences
Hm, it seems I possibly used the technical term "nested" inappropriately in my response. I meant: "If Month is a repeated measure within each Quadrat..." and "If Treatment is also a repeated measure within each Quadrat..."

On Wed, Apr 1, 2009 at 8:21 AM, Mike Lawrence mike.lawre...@dal.ca wrote: If Month is nested within Quadrat I think you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/Month), RM)

If Treatment is also nested within Quadrat, you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/(Treatment*Month)), RM)

On Wed, Apr 1, 2009 at 12:42 AM, Jessica L Hite/hitejl/O/VCU hit...@vcu.edu wrote: I have data on the proportion of clutches experiencing different fates (e.g., 4 different sources of mortality) for 5 months. [snip]
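For readers wanting a self-contained illustration of the Error() syntax suggested above, here is a sketch on simulated data (all names and numbers are invented; with one observation per cell the lowest stratum is saturated, so treat this purely as a syntax demo):

```r
set.seed(1)
RM <- expand.grid(Quadrat = factor(1:5), Month = factor(1:4),
                  Treatment = factor(1:4))
RM$ProportioninTreatment <- runif(nrow(RM))
# Treatment and Month are both repeated measures within Quadrat:
fit <- aov(ProportioninTreatment ~ Treatment * Month +
           Error(Quadrat / (Treatment * Month)), data = RM)
summary(fit)
```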
Re: [R] R Language Bloggers and Web Sites Owners
Hm, this feels reminiscent of the Wellsphere free blog content scam: http://neurocritic.blogspot.com/2009/02/wealthcentral.html
Re: [R] help about: anova and population no normal
Those with more formal statistical backgrounds may provide better advice, but in my own informal training I've come to wonder why parametric stats persist in the face of modern computing power. As I understand it, Fisher developed ANOVA as a low-computation method of approximating the Randomization Test (a.k.a. the exhaustive permutation test). Computing power has grown exponentially since Fisher's time, so these days it is feasible to compute the full R-Test in many cases, and in those cases where the full R-Test is not feasible, non-exhaustive permutation variants usually suffice. Indeed, it has been shown (http://brm.psychonomic-journals.org/content/37/3/426.short) that the R-Test is more powerful than the F-Test in the face of skewed distributions. My advice would thus be to abandon parametrics and simply code a randomization test variant of the ANOVA you want. Mike

On Tue, Mar 31, 2009 at 10:29 AM, Sarti Maurizio sart...@irea.cnr.it wrote: Dear R-Helpers, Parametric statistics are statistics where the population is assumed to fit some parametrized distribution (most typically the normal distribution). My problem is: 1) my population is not normal; 2) the sample data of all replications and treatments are well fitted by the Weibull distribution (shape and scale parameters). Can the shape and scale parameters be compared between treatments using the canonical analysis of variance (ANOVA)? Many thanks for your help with these questions.
Re: [R] help about: anova and population no normal
Oh, and to answer your question more directly: the randomization test permits testing hypotheses using any metric, so scale and shape are definitely testable. Typically one is interested in means, so on each iteration of the test loop one computes the group/condition means. However, it's simple to instead compute, say, the best-fit Weibull for each condition and simultaneously test shift, shape, and scale.

On Tue, Mar 31, 2009 at 12:18 PM, Mike Lawrence mike.lawre...@dal.ca wrote: Those with more formal statistical backgrounds may provide better advice, but in my own informal training I've come to wonder why parametric stats persist in the face of modern computing power. [snip]
Maurizio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tinyurl.com/mikes-public-calendar ~ Certainty is folly... I think. ~
Re: [R] how to input multiple .txt files
oops, didn't read the question fully. If you want to create 2 master files:

cust1_files = list.files(path=path_to_my_files, pattern='cust1', full.names=TRUE)
a = NULL
for(this_file in cust1_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, 'cust1.master.txt')

cust2_files = list.files(path=path_to_my_files, pattern='cust2', full.names=TRUE)
a = NULL
for(this_file in cust2_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, 'cust2.master.txt')

On Mon, Mar 30, 2009 at 8:55 AM, Mike Lawrence mike.lawre...@dal.ca wrote:

my_files = list.files(path=path_to_my_files, pattern='.txt', full.names=TRUE)
a = NULL
for(this_file in my_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, my_new_file_name)

On Sun, Mar 29, 2009 at 10:37 PM, Qianfeng Li qflic...@yahoo.com wrote: How to input multiple .txt files? A data folder has lots of .txt files from different customers. I want to read all these .txt files into different master files, such as: cust1.xx.txt, cust1.xxx.txt, cust1..txt, .. to master file X.txt; cust2.xx.txt, cust2.xxx.txt, cust2..txt, .. to master file Y.txt. Thanks!
Re: [R] how to input multiple .txt files
my_files = list.files(path=path_to_my_files, pattern='.txt', full.names=TRUE)
a = NULL
for(this_file in my_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, my_new_file_name)

On Sun, Mar 29, 2009 at 10:37 PM, Qianfeng Li qflic...@yahoo.com wrote: How to input multiple .txt files? A data folder has lots of .txt files from different customers. I want to read all these .txt files into different master files, such as: cust1.xx.txt, cust1.xxx.txt, cust1..txt, .. to master file X.txt; cust2.xx.txt, cust2.xxx.txt, cust2..txt, .. to master file Y.txt. Thanks!
Re: [R] how to input multiple .txt files
To repent for my sins, I'll also suggest that Hadley Wickham's plyr package (http://had.co.nz/plyr/) is also useful/parsimonious in this context:

a <- ldply(cust1_files, read.table)

On Mon, Mar 30, 2009 at 9:32 AM, baptiste auguie ba...@exeter.ac.uk wrote: May I suggest the following:

a <- do.call(rbind, lapply(cust1_files, read.table))

(I believe expanding objects in a for loop belongs in the R Inferno.) baptiste

On 30 Mar 2009, at 12:58, Mike Lawrence wrote:

cust1_files = list.files(path=path_to_my_files, pattern='cust1', full.names=TRUE)
a = NULL
for(this_file in cust1_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, 'cust1.master.txt')

_ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
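Putting the suggestions in this thread together, a sketch that builds one master file per customer prefix without growing an object inside a loop (`path_to_my_files` is assumed to be defined, as in the earlier replies):

```r
for (cust in c("cust1", "cust2")) {
  # all files for this customer
  files <- list.files(path = path_to_my_files, pattern = cust, full.names = TRUE)
  # read and row-bind them in one step
  combined <- do.call(rbind, lapply(files, read.table))
  write.table(combined, paste(cust, ".master.txt", sep = ""))
}
```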
Re: [R] loading and manipulating 10 data frames-simplified
Bin10_cumper = cumsum(Bin10_Acres/Bin10Acres_sum)
#Calculates the probability of choosing a particular parcel from the bin
Bin10_parprob = abs(1 - Bin10_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin10Main.data = cbind(Bin10_Acres, Bin10_parprob, Bin10_TAZ, Bin10_TAZvacant)
#--

I have tried:

Bin <- lapply(1:10, function(.file){
	#Loads bin data frame from csv files with acres and TAZ data
	fileName <- paste("I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin", .file, "_lookup.csv", sep="")
	Bin_main <- read.csv(file=fileName, head=FALSE)
	#Separates acres data from main data and converts acres to square feet
	Bin_Acres = Bin_main[[1]] * 43560
	#Separates TAZ data from main data
	Bin_TAZ = Bin_main[[2]]
	#Separates vacant TAZ data from main data and converts acres to square feet
	Bin_TAZvacant = Bin_main[[3]] * 43560
	#Sums the parcel acreage data of the bin
	BinAcres_sum = sum(Bin_Acres)
	#Creates a vector of cumulative percentages for each parcel of the bin
	Bin_cumper = cumsum(Bin_Acres/BinAcres_sum)
	#Calculates the probability of choosing a particular parcel from the bin
	Bin_parprob = abs(1 - Bin_cumper)
	#Combines parcel acreage data and cumulative percentage data
	BinMain.data = cbind(Bin_Acres, Bin_parprob, Bin_TAZ, Bin_TAZvacant)
})
#-

I am thinking that a simple for loop would work with a paste command, but that approach has yet to work either. Thanks. Cheers, JR

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Way to rotate a histogram?
In case anyone is still interested, a slight improvement is to plot both the density and normal distributions on top of the empirical histogram (the previous version plotted only the density):

library(ggplot2)
test_data <- rnorm(100, mean=10, sd=4)
a = data.frame(obs=test_data, condition='None')
p1 = ggplot(data=a, aes(x=obs)) +
	geom_histogram(aes(y=..density..)) +
	stat_density(mapping=aes(ymax=max(..density..)), geom='path', colour='red') +
	stat_function(fun=dnorm, args=list(m=mean(a$obs), sd=sd(a$obs)), colour='green') +
	scale_x_continuous(limits=range(a$obs)) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect()) +
	coord_flip()
p2 = ggplot(data=a, aes(x=condition, y=obs)) +
	geom_boxplot() +
	scale_y_continuous(limits=range(a$obs)) +
	scale_x_discrete(name='', labels='') +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
p3 = ggplot(data=a, aes(sample=(obs-mean(obs))/sd(obs))) +
	stat_qq(distribution=qnorm) +
	geom_abline(intercept=0, slope=1) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
print(p1, vp=viewport(width=1/3, height=1, x=1/3*.5, y=.5))
print(p2, vp=viewport(width=1/3, height=1, x=1/3+1/3*.5, y=.5))
print(p3, vp=viewport(width=1/3, height=1, x=2/3+1/3*.5, y=.5))
Re: [R] geometric mean of probability density functions
If you're interested in comparing empirical to simulated distributions, I see two alternatives to your density() approach (which will be sensitive to your choice of bandwidth). Both of the following have been used in my field to look at the fit of empirical response-time data to models of human information processing: (1) estimate some number of quantiles for each set of 1000 replicates, average the quantile points, then compare the results to a similar set of quantiles generated from your simulation data; (2) fit an a priori distribution to each set of 1000 replicates, then, within each distribution parameter, average across sets and compare to the parameters obtained from fitting the simulation data.

#generate some fake simulation data
sim.x = rnorm(1e5, 100, 20)

#make up some fake empirical data
a = data.frame(
	set = rep(1:8, each=1e3)
	, x = rnorm(8*1e3, 100, 20) + rexp(8*1e3, 50)
)

#define a function to compute the geometric mean
##(fails with negative numbers!)
geometric.mean = function(x) exp(mean(log(x)))

# Quantile approach
#set up a data frame to collect empirical quantile data
emp.q = expand.grid(set=unique(a$set), q=seq(.1,.9,.1), x=NA)

#estimate empirical quantiles
for(set in unique(a$set)){
	emp.q$x[emp.q$set==set] = quantile(a$x[a$set==set], probs=unique(emp.q$q))
}

#compute the geometric mean for each empirical quantile
emp.q.means = with(emp.q, aggregate(x, list(q=q), geometric.mean))

#estimate the quantiles from the simulation data
sim.q = quantile(sim.x, unique(emp.q$q))

#assess the fit using the sum of squared scaled errors
quantile.sum.sq.sc.err = sum(((emp.q.means$x - sim.q)/sim.q)^2)
print(quantile.sum.sq.sc.err)

# Fitting approach using the Generalized Lambda Distribution
## GLD chosen because it's flexible and easily fit using the gld package
#install the gld package if necessary
if(!('gld' %in% installed.packages())){
	install.packages('gld')
}
#load the gld package
library(gld)

#set up a data frame to collect empirical GLD parameters
emp.gld.par = expand.grid(set=unique(a$set), lambda=1:4, x=NA)

#estimate empirical GLD parameters
## print-out keeps an eye on convergence (hopefully 0)
## takes a while to complete
for(set in unique(a$set)){
	fit = starship(a$x[a$set==set])
	cat('Set:', set, ', convergence:', fit$optim.results$convergence, '\n')
	emp.gld.par$x[emp.gld.par$set==set] = fit$lambda
}

#compute the geometric mean for each empirical GLD parameter
emp.gld.par.means = with(emp.gld.par, aggregate(x, list(lambda=lambda), geometric.mean))

#estimate the GLD parameters from the simulation data
sim.fit = starship(sim.x)
cat('Sim convergence:', sim.fit$optim.results$convergence, '\n')
sim.gld.par = sim.fit$lambda

#assess the fit using the sum of squared scaled errors
gld.par.sum.sq.sc.err = sum(((emp.gld.par.means$x - sim.gld.par)/sim.gld.par)^2)
print(gld.par.sum.sq.sc.err)
Re: [R] Way to rotate a histogram?
Using ggplot2:

library(ggplot2)
test_data <- rnorm(100)
a = data.frame(obs=test_data, condition='None')
p1 = qplot(data=a, x=obs, geom='histogram') + coord_flip()
p2 = qplot(data=a, y=obs, x=condition, geom='boxplot') +
	opts(axis.text.y=theme_blank(), axis.title.y=theme_blank()) +
	coord_flip()
p3 = qplot(sample=test_data, stat='qq', distribution=qnorm) + coord_flip()
print(p1, vp=viewport(width=1, height=1/3, x=.5, y=1/3*.5))
print(p2, vp=viewport(width=1, height=1/3, x=.5, y=1/3+1/3*.5))
print(p3, vp=viewport(width=1, height=1/3, x=.5, y=2/3+1/3*.5))

On Tue, Mar 17, 2009 at 12:38 PM, Jason Rupert jasonkrup...@yahoo.com wrote: Here is what I have so far:

test_data <- rnorm(100)
par(mfrow=c(1,3)) # 1 row by 3 columns layout of plots
hist(test_data)
boxplot(test_data)
qqnorm(test_data)

I noticed that I can rotate a boxplot via horizontal, but apparently hist does not have that functionality. I tried stacking the plots vertically:

test_data <- rnorm(100)
par(mfrow=c(3,1)) # 3 rows by 1 column layout of plots
hist(test_data)
boxplot(test_data, horizontal=TRUE)
qqnorm(test_data)

However, I would have to rotate the qqnorm plot, which would be pretty confusing and I think non-standard. Thank you again for any feedback and insight regarding trying to reproduce the JMP figure shown at: http://n2.nabble.com/Can-R-produce-this-plot--td2489288.html

--- On Tue, 3/17/09, Jason Rupert jasonkrup...@yahoo.com wrote: From: Jason Rupert jasonkrup...@yahoo.com Subject: Re: [R] R package to automatically produce combination plot? To: R-help@r-project.org Date: Tuesday, March 17, 2009, 9:39 AM I guess no reply means there is not an existing package to produce the plot? I will post the results of my script to hopefully help others who are trying to formulate the same plot. Thanks again.

--- On Mon, 3/16/09, Jason Rupert jasonkrup...@yahoo.com wrote: From: Jason Rupert jasonkrup...@yahoo.com Subject: [R] R package to automatically produce combination plot? To: R-help@r-project.org Date: Monday, March 16, 2009, 8:14 PM By any chance is there an R package that automatically produces the plot shown at the following link: http://n2.nabble.com/Can-R-produce-this-plot--td2489288.html That is, an R package to produce one plot that has the following: (a) a vertically oriented histogram, (b) an associated barplot, and (c) a quantile-quantile plot (Q-Q plot). This is based on a class lecture from the University of Pennsylvania: stat.wharton.upenn.edu/~mcjon/stat-431/lecture-02.pdf I am pretty confident I can put one together, but just wanted to check that there does not already exist an R package to output such a plot. Thanks again.
Re: [R] Way to rotate a histogram?
I actually pasted the wrong code! This attempts to replicate the original request, the JMP graphic:

library(ggplot2)
test_data <- rnorm(100, mean=10, sd=4)
a = data.frame(obs=test_data, condition='None')
p1 = ggplot(data=a, aes(x=obs)) +
	geom_histogram(aes(y=..density..)) +
	geom_density() +
	scale_x_continuous(limits=range(a$obs)) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect()) +
	coord_flip()
p2 = ggplot(data=a, aes(x=condition, y=obs)) +
	geom_boxplot() +
	scale_y_continuous(limits=range(a$obs)) +
	scale_x_discrete(name='', labels='') +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
p3 = ggplot(data=a, aes(sample=(obs-mean(obs))/sd(obs))) +
	stat_qq(distribution=qnorm) +
	geom_abline(intercept=0, slope=1) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
print(p1, vp=viewport(width=1/3, height=1, x=1/3*.5, y=.5))
print(p2, vp=viewport(width=1/3, height=1, x=1/3+1/3*.5, y=.5))
print(p3, vp=viewport(width=1/3, height=1, x=2/3+1/3*.5, y=.5))

On Tue, Mar 17, 2009 at 6:36 PM, David Winsemius dwinsem...@comcast.net wrote: Nice work, Mike. Actually I think he was looking for it done this way:

library(ggplot2)
test_data <- rnorm(100)
a = data.frame(obs=test_data, condition='None')
p1 = qplot(data=a, x=obs, geom='histogram') + coord_flip()
p2 = qplot(data=a, y=obs, x=condition, geom='boxplot') +
	opts(axis.text.y=theme_blank(), axis.title.y=theme_blank())
p3 = qplot(sample=test_data, stat='qq', distribution=qnorm)
print(p1, vp=viewport(width=1/6, height=1, y=.5, x=1/6*.5))
print(p2, vp=viewport(width=1/6, height=1, y=.5, x=1/6+1/6*.5))
print(p3, vp=viewport(width=2/3, height=1, y=.5, x=1/3+2/3*.5))

Perhaps with a red line through the hinge points. And we'd probably need to get the axes aligned properly, but it's very close. -- David Winsemius

On Mar 17, 2009, at 5:14 PM, Mike Lawrence wrote:

library(ggplot2)
test_data <- rnorm(100)
a = data.frame(obs=test_data, condition='None')
p1 = qplot(data=a, x=obs, geom='histogram') + coord_flip()
p2 = qplot(data=a, y=obs, x=condition, geom='boxplot') +
	opts(axis.text.y=theme_blank(), axis.title.y=theme_blank()) +
	coord_flip()
p3 = qplot(sample=test_data, stat='qq', distribution=qnorm) + coord_flip()
print(p1, vp=viewport(width=1, height=1/3, x=.5, y=1/3*.5))
print(p2, vp=viewport(width=1, height=1/3, x=.5, y=1/3+1/3*.5))
print(p3, vp=viewport(width=1, height=1/3, x=.5, y=2/3+1/3*.5))

David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] How to find maximum values on the density function of a random variable
rv <- rbinom(1, 1, 0.1) + rnorm(1)
d.rv = density(rv)
d.x = d.rv$x
d.y = d.rv$y
d.rv.max = d.rv$x[which.max(d.rv$y)]
plot(d.rv)
abline(v=d.rv.max)
#that what you want?

On Thu, Mar 12, 2009 at 6:28 PM, g...@ucalgary.ca wrote: I would like to find the maximum values of the density function of a random variable. For example, I have a random variable rv <- rbinom(1,1,0.1) + rnorm(1). Its density function is given by density(rv) and can be displayed by plot(density(rv)). How do I calculate its maximum values? A density function may have a few (global and local) maxima. Please help. Thanks, -james
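Since the original question asked about multiple (global and local) maxima, a sketch that finds every local peak of the estimated density by locating sign changes in its first differences (using a larger sample, since density() of a single draw isn't very informative):

```r
set.seed(1)
rv <- rbinom(1000, 1, 0.1) + rnorm(1000)
d <- density(rv)

# a local maximum is where the slope changes from positive to negative
is.peak <- c(FALSE, diff(sign(diff(d$y))) == -2, FALSE)
peaks <- d$x[is.peak]

plot(d)
abline(v = peaks, col = "red")
```

which.max() then picks the global maximum out of `peaks` if you need just one value.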
[R] OT: Likelihood ratio for the randomization/permutation test?
Hi guRus, My discipline (experimental psychology) is gradually moving away from null hypothesis testing and towards measures of evidence. One measure of evidence that has been popular of late is the likelihood ratio. Glover & Dixon (2005) demonstrate the calculation of the likelihood ratio from ANOVA tables, but I'm also interested in non-parametric statistics and wonder if anyone has ideas on how to compute a likelihood ratio from a randomization test (a.k.a. permutation test). Say one had two groups and was interested in whether the mean scores of the two groups differ in a manner consistent with random chance, or in a manner consistent with a non-null effect of some manipulation applied to the two groups. The randomization test addresses this by randomly re-assigning the participants to the groups, re-computing the difference between means, and repeating many times, yielding a distribution of simulated difference scores that represents the distribution expected by chance. Within a null-hypothesis-testing framework, you then estimate the probability of the observed difference under the null by observing the proportion of simulated difference scores that are greater in magnitude than the observed difference score. Any guesses on how to translate this into a quantification of evidence? Mike
Re: [R] Centering multi-line strip text in lattice
Perfect, thanks!

On Wed, Mar 11, 2009 at 7:41 PM, Paul Murrell p.murr...@auckland.ac.nz wrote: Hi, Mike Lawrence wrote: I'm having trouble centering multi-line strip text in lattice. As the code below demonstrates, the bounding box of the text is centered within the strip, but the first line isn't centered relative to the longer second line. The adj argument to par.strip.text doesn't seem to do much. Suggestions?

a = data.frame(
	x = rep(1:10, 2)
	, y = rep(1:10, 2)
	, z = rep(c('First Line\nLonger Second Line (1)', 'First Line\nLonger Second Line (2)'), each=10)
)
xyplot(
	y ~ x | z
	, data = a
	, par.strip.text = list(cex = .75, lineheight = 1, lines = 2, adj = .5)
)

Here's one way, by writing your own strip function that calls the default strip function with blanked-out labels and then draws the labels directly with 'grid' calls:

xyplot(
	y ~ x | z
	, data = a
	, par.strip.text = list(lines = 1.5)
	, strip = function(which.panel, factor.levels, ...){
		strip.default(which.panel = which.panel, factor.levels = rep("", length(factor.levels)), ...)
		pushViewport(viewport(clip = "on"))
		grid.text(factor.levels[which.panel], gp = gpar(cex = .75, lineheight = 1))
		popViewport()
	}
)

Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 p...@stat.auckland.ac.nz http://www.stat.auckland.ac.nz/~paul/
Re: [R] find the important inputs to the neural network model in nnet package
One thought is to train the net and obtain a performance measure on a testing corpus. Next, for each input, run the testing corpus again, but zero all values for that input and obtain a new measure of performance. Zeroing an important input will hurt performance more than zeroing an unimportant one.

On Tue, Mar 10, 2009 at 9:41 AM, abbas tavassoli tavassoli...@yahoo.com wrote: Hi, I have a binary variable and many explanatory variables, and I want to use the nnet package to model these data (instead of logistic regression). I want to find the most effective variables (inputs to the network) in the neural network model. How can I do this? Thanks.
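A rough sketch of that zeroing idea using nnet, with made-up data (all variable names here are purely illustrative); "performance" is hold-out classification accuracy, so the assumption is that you have reserved part of your data as a test set:

```r
library(nnet)
set.seed(1)

# made-up data: y depends strongly on x1, weakly on x2, not at all on x3
dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
dat$y <- factor(ifelse(dat$x1 + 0.3 * dat$x2 + rnorm(200, sd = 0.5) > 0, "a", "b"))
train <- dat[1:150, ]
test <- dat[151:200, ]

fit <- nnet(y ~ x1 + x2 + x3, data = train, size = 3, trace = FALSE)
base.acc <- mean(predict(fit, test, type = "class") == test$y)

# drop in accuracy when each input is zeroed in turn;
# a larger drop suggests a more important input
sapply(c("x1", "x2", "x3"), function(v) {
  zeroed <- test
  zeroed[[v]] <- 0
  base.acc - mean(predict(fit, zeroed, type = "class") == test$y)
})
```

Because single nnet fits depend on random starting weights, averaging this importance measure over several refits would give a more stable ranking.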
Re: [R] simple question beginner
names() is a great function for finding out how to get info from objects of unknown structure, so try: names(out)

2009/3/10 Λεωνίδας Μπαντής bleonida...@yahoo.gr: Hi there, I am a beginner in R and I have a basic question. Suppose I run a common procedure such as a t-test or a Cox model like below:

out <- coxph(Surv(tstart, tstop, death1) ~ x1 + x1:log(tstop+1), test1, method=c("breslow"))

which yields the following result:

Call: coxph(formula = Surv(tstart, tstop, death1) ~ x1 + x1:log(tstop + 1), data = test1, method = c("breslow"))

                    coef  exp(coef)  se(coef)     z     p
x1                 -9.58   6.89e-05      6.83  -1.40  0.16
x1:log(tstop + 1)   6.90   9.93e+02      5.63   1.23  0.22

Likelihood ratio test=2.97 on 2 df, p=0.226 n= 120

Now I simply want to create an array (say a) with the coefficients, i.e. I want a <- c(-9.58, 6.90). Generally, how can I take the elements I want from the output above for further manipulation? Thanks in advance for any answer!!
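To make the extraction concrete with a reproducible model, an lm() fit on the built-in cars data stands in for the coxph() fit here; coef() works the same way on both kinds of fitted object:

```r
fit <- lm(dist ~ speed, data = cars)
names(fit)        # lists the components of the fitted object
a <- coef(fit)    # named numeric vector of coefficients
a[["speed"]]      # extract a single coefficient by name
```

For the coxph() example above, coef(out) would return the vector containing -9.58 and 6.90.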
Re: [R] Changing factor to numeric
Try: as.numeric(as.character(x))

I usually define the following for this purpose:

factor.to.number = function(x){
	as.numeric(as.character(x))
}

On Tue, Mar 10, 2009 at 2:25 AM, ojal john owino ojal.johnow...@googlemail.com wrote: Dear Users, I have a variable in my dataset which is of type factor, but it actually contains numeric entries like 5.735 and 4.759. This is because the data was read from a CSV file into R and this variable contained other characters which were not numeric. I have now dropped the records with the non-numeric characters for this variable and want to change it to numeric storage type. I have tried using the as.numeric() function, but it changes the values of the variable to what I think are the ranks of the individual values in the dataset. For example, if 5.735 is the current content of the field, then the new object created by as.numeric() will contain a value like 680 if 5.735 was the highest value for the variable and the dataset had 680 records. How can I change the storage type without changing the contents of this variable? Thanks for your consideration. -- Ojal John Owino P.O Box 230-80108 Kilifi, Kenya. Mobile: +254 728 095 710
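A tiny demonstration of why the as.character() step is needed: as.numeric() on a factor returns the internal level codes, not the printed values:

```r
x <- factor(c("5.735", "4.759"))
as.numeric(x)                  # internal level codes 2 1, not the data
as.numeric(as.character(x))   # 5.735 4.759, the actual values
as.numeric(levels(x))[x]      # equivalent, converting each level only once
```

The last form is the one recommended in ?factor, since it converts each distinct level once rather than every element.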
[R] Centering multi-line strip text in lattice
I'm having trouble centering multi-line strip text in lattice. As the code below demonstrates, the bounding box of the text is centered within the strip, but the first line isn't centered relative to the longer second line. The adj argument to par.strip.text doesn't seem to do much. Suggestions?

a = data.frame(
	x = rep(1:10, 2)
	, y = rep(1:10, 2)
	, z = rep(c('First Line\nLonger Second Line (1)', 'First Line\nLonger Second Line (2)'), each=10)
)
xyplot(
	y ~ x | z
	, data = a
	, par.strip.text = list(cex = .75, lineheight = 1, lines = 2, adj = .5)
)
Re: [R] how to estimate distribution?
I know it only causes a shift-and-scale transformation of the distribution, but it always gets my goat when I see people using the sum as a statistic of interest when the mean has a more general application (i.e. another researcher might obtain a different number of observations per subject) and a more meaningful interpretation (i.e. the mean is the answer you'd expect, on average, from the participant on any given question; the sum is ... ?).

An accepted method of fitting empirical data to theoretical distributions is maximum likelihood estimation. Sometimes there are analytic solutions; sometimes you have to use a search algorithm like optimize() (for 1-parameter distributions) or optim() (for multiple-parameter distributions) to find the solution. Say you hypothesized that the red data came from an exponential distribution: you would define a function that obtains the sum log-likelihood of the data given a specific rate parameter, then search a reasonable range of rate parameters for the one that maximizes the sum log-likelihood:

exp.sll = function(rate, x){
	log.lik = dexp(x, rate, log=TRUE)
	sum.log.lik = sum(log.lik)
	print(c(rate, sum.log.lik))
	return(sum.log.lik)
}

a = rexp(150, rate=10)
hist(a, probability=TRUE)
best.rate = optimize(
	exp.sll
	, x = a
	, lower = 0
	, upper = 1/mean(a)*10
	, maximum = TRUE
)$maximum
curve(dexp(x, rate=best.rate), col='red', add=TRUE)

On Wed, Mar 4, 2009 at 12:01 PM, Simone Gabbriellini simone.gabbriell...@gmail.com wrote: Dear R-Experts, I have an empirical dataset with 150 subjects and 24 observations each. In each observation, a subject can score in the range 0:3. I then made a simple index by summing the values in each row, so each subject has a score between 0 and 72.
I was thinking about what kind of theoretical distribution such an index should have, so I tried to make things random like:

data <- array(0, c(150, 24))
data2 <- array(0, c(150, 100))
for (prova in 1:100){
  for (i in 1:24){
    for (q in 1:150){
      data[q, i] <- sample(0:3, 1)
    }
  }
  for (riga in 1:150){
    data2[riga, prova] <- sum(data[riga, ])
  }
}

now here you can find the plotted theoretical values (black) against the empirical ones (red): http://www.digitaldust.it/papers/indice.png How can I estimate the density of both distributions? because the red one looks like a Pareto distribution, but I am not an expert... many thanks, Simone
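As an aside, the triple loop above can be drawn in a single call: sample the whole 150×24 score matrix at once and sum row-wise. A minimal sketch (variable names are illustrative, not from the original post); by the central limit theorem, the row sums of 24 independent uniform{0..3} scores should look approximately normal:

```r
set.seed(1)
n.subj <- 150
n.obs <- 24

# one draw of the whole score matrix, each cell uniform on 0:3
scores <- matrix(sample(0:3, n.subj * n.obs, replace = TRUE),
                 nrow = n.subj, ncol = n.obs)

# one simulated index per subject, necessarily in 0..72
index <- rowSums(scores)
range(index)
```

This replaces one iteration of the outer `prova` loop; repeating it 100 times (e.g. with replicate()) reproduces the full simulation without the inner loops.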
Re: [R] info
http://lmgtfy.com/?q=Parallel+R

The first entry provides an article that notes the various solutions thus far (Rmpi, snow, pnmath).

On Wed, Mar 4, 2009 at 7:06 AM, Enrico Giorgi giorgi_enr...@hotmail.it wrote: Dear Sir or Madam, I've been using R for one year and I really appreciate it. I would like to know if a version performing parallel computations on multicore computers and computer clusters exists. Thank you very much. Yours sincerely. Enrico Giorgi
Re: [R] Computing sd across an array with missing values
It just so happens that I created a vectorized SD function the other day. See the column and row versions below. If there are any rows/columns with only one non-NA value, it will return NaN.

col_sd = function(x){
  dimx = dim(x)
  x.mean = .Internal(colMeans(x, dimx[1], dimx[2], na.rm = TRUE))
  err = t(t(x) - x.mean)
  err.sq = err * err
  sum.err.sq = .Internal(colSums(err.sq, dimx[1], dimx[2], na.rm = TRUE))
  n = .Internal(colSums(!is.na(x), dimx[1], dimx[2], na.rm = TRUE))
  sqrt(sum.err.sq / (n - 1))
}

row_sd = function(x){
  dimx = dim(x)
  x.mean = .Internal(rowMeans(x, dimx[1], dimx[2], na.rm = TRUE))
  err = x - x.mean
  err.sq = err * err
  sum.err.sq = .Internal(rowSums(err.sq, dimx[1], dimx[2], na.rm = TRUE))
  n = .Internal(rowSums(!is.na(x), dimx[1], dimx[2], na.rm = TRUE))
  sqrt(sum.err.sq / (n - 1))
}

On Wed, Feb 25, 2009 at 5:11 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Dear Matt, You're absolutely right. The reason why my code gives different results is that it calculates the standard deviation for each row/column in all the arrays rather than for every cell. My bad.
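The same vectorization can be written without .Internal() — colMeans() and colSums() accept na.rm directly — which also makes it easy to check against apply(). A sketch (the function name col_sd2 is illustrative):

```r
# vectorized per-column SD, NA-tolerant, using only exported base functions
col_sd2 <- function(x) {
  n      <- colSums(!is.na(x))                   # non-NA count per column
  x.mean <- colMeans(x, na.rm = TRUE)
  err.sq <- sweep(x, 2, x.mean)^2                # squared deviations from column mean
  sqrt(colSums(err.sq, na.rm = TRUE) / (n - 1))  # NaN when a column has n == 1
}

# quick agreement check against the slow-but-trusted apply() route
x <- matrix(c(1:10, NA, 12), nrow = 4)
all.equal(col_sd2(x), apply(x, 2, sd, na.rm = TRUE))
```

This sidesteps relying on the internal signatures of colMeans/colSums, which are not part of the public API and can change between R versions.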
You can easily get the results you posted by running apply(p, 1:2, sd, na.rm = TRUE). Here is another option which gives the same results you posted :-) Unfortunately there is no timing improvement :( (Note that I ran both the sd1 and sd2 functions using pp instead of p)

# -
# System times
# -
pp <- array(c(1:5, rep(NA, times = 3)), dim = c(1000, 1000, 60))
system.time(apply(pp, 1:2, sd1))
# user system elapsed
# 93.16 0.23 94.87
system.time(apply(pp, 1:2, sd2))
# user system elapsed
# 66.51 0.30 67.84

# --
# Functions
# --
sd1 <- function(x) sd(x, na.rm = TRUE)
sd2 <- function(i){
  if(sum(!is.na(i)) == 0) temp.sd <- NA else temp.sd <- sd(i, na.rm = TRUE)
  temp.sd
}

Regards, Jorge

On Wed, Feb 25, 2009 at 3:23 PM, Matt Oliver moli...@udel.edu wrote: Hi Jorge, this does not seem to return the same thing as

p <- array(c(1:5, rep(NA, times = 3)), dim = c(5, 5, 3))
sd_fun <- function(i){
  if(sum(!is.na(i)) == 0){
    temp.sd <- NA
  }else{
    temp.sd <- sd(i, na.rm = TRUE)
  }
  return(temp.sd)
}
apply(p, 1:2, sd_fun)

Am I missing something basic here?

On Wed, Feb 25, 2009 at 11:47 AM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Hi Matt, There is a typo in the first way. My apologies. It should be:

# First
res <- apply(p, 3, function(X) c(scols = apply(X, 2, sd, na.rm = TRUE), srows = apply(X, 1, sd, na.rm = TRUE)))
res

HTH, Jorge

On Wed, Feb 25, 2009 at 11:42 AM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Dear Matt, Here are two ways:

# Data
p <- array(c(1:5, rep(NA, times = 3)), dim = c(5, 5, 3))

# First
res <- apply(p, 3, function(X) c(scols = apply(X, 2, sd, na.rm = TRUE), srows = apply(X, 2, sd, na.rm = TRUE)))
res

# Second
res2 <- apply(p, 3, function(X) list(scols = apply(X, 2, sd, na.rm = TRUE), srows = apply(X, 1, sd, na.rm = TRUE)))
lapply(res2, function(x) do.call(rbind, x))

HTH, Jorge

On Wed, Feb 25, 2009 at 10:53 AM, Matt Oliver moli...@udel.edu wrote: Dear help, suppose I have this array and want to compute sd across rows and columns.
p <- array(c(1:5, rep(NA, times = 3)), dim = c(5, 5, 3))

apply(p, 1:2, sd) fails because sd requires at least 2 numbers to compute sd. apply(p, 1:2, sd, na.rm = TRUE) fails for the same reason. I crafted my own function that does what I want:

sd_fun <- function(i){
  if(sum(!is.na(i)) == 0){
    temp.sd <- NA
  }else{
    temp.sd <- sd(i, na.rm = TRUE)
  }
  return(temp.sd)
}
apply(p, 1:2, sd_fun)

This does what I want, but when I scale up to large arrays like

pp <- array(c(1:5, rep(NA, times = 3)), dim = c(1000, 1000, 60))

the apply function takes a long time to run. Is there a faster, more efficient way to do this? Thanks in advance, Matt

-- Matthew J. Oliver Assistant Professor College of Marine and Earth Studies University of Delaware 700 Pilottown Rd. Lewes, DE, 19958 302-645-4079 http://www.ocean.udel.edu/people/profile.aspx?moliver

-- Mike
Re: [R] colored maps again
Here's a bit of a hacky way to do it:

#get the names of each state
states <- map('state', plot = FALSE)$names
#set up some random state-color data
cols <- data.frame(state = states, x = sample(1:10, length(states), replace = TRUE))
#do the plot
map('usa')
for(this_state in states){
  map('state', region = this_state, add = TRUE, fill = TRUE, col = cols$x[cols$state == this_state])
}

On Thu, Feb 19, 2009 at 12:45 PM, Alina Sheyman alina...@gmail.com wrote: I'm trying to create a colored map that would show the number of students per state. My data frame consists of two columns - state and count. I'm using the following code:

library(maps)
map('usa')
library(plotrix)
state.col <- color.scale(gre$count, 0, 0, c(0, 1))
map('state', fill = TRUE, col = state.col)

I'm getting a map, but the values are not being mapped to the correct states. What do I need to do to fix that?
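The per-state loop can also be avoided: if the vector of colours is reordered to line up with the order in which the regions will be drawn, a single match() does the job. A self-contained sketch of just the lookup logic (toy region names and colours; with the maps package one would take the drawing order from map('state', plot = FALSE)$names):

```r
# toy lookup table: one colour per region name, in arbitrary order
regions <- c("alabama", "alaska", "arizona")
cols <- data.frame(state = regions,
                   x = c("red", "green", "blue"),
                   stringsAsFactors = FALSE)

# the order in which regions will actually be drawn
plot.order <- c("arizona", "alabama", "alaska")

# reorder the colours to match the drawing order
fill <- cols$x[match(plot.order, cols$state)]
fill  # "blue" "red" "green"

# then a single call suffices, e.g.:
# map('state', region = plot.order, fill = TRUE, col = fill)
```

This is also the likely cause of the original problem: color.scale() produced colours in data-frame order, not in map()'s drawing order.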
Re: [R] error bars
Check out ggplot2: http://had.co.nz/ggplot2/ Particularly: http://had.co.nz/ggplot2/geom_errorbar.html If you need more help you'll have to provide a self-contained set of data/code. Hopefully you'll soon be completely rid of this strange excel of which you speak :Op

On Wed, Feb 18, 2009 at 9:38 PM, Nicole Hackman nicole.hack...@gmail.com wrote: Hello, I have a very simple data set I imported from excel, including 96 averages in a column along with 96 standard errors associated with those averages (calculated in excel). I plotted the 96 averages using R and I am wondering if it is possible to plot the second column of standard errors, applying error bars to each value so they represent the error corresponding to each average? thanks, Nicole -- Nicole Hackman Master of Science Student University of Washington College of Forest Resources Box 352100 n...@u.washington.edu
Re: [R] anova help
Provide some example data and code and someone may be able to help.

On Sat, Feb 14, 2009 at 2:46 PM, Joe King j...@joepking.com wrote: Hi all, I am trying to run a two-factor anova, but one of the factors is a random factor. I am also running it in SPSS, and it seems R is dividing by the wrong term to get the appropriate F term. Here is my data. In SPSS the F scores are about double the ones in R. How can I specify one of my factors as a random factor, or change it to where it does the right model fitting? I am using the lm command instead of glm. I am new to R so this might seem basic. Joe King, M.A. mailto:j...@joepking.com j...@joepking.com Never give in, never give in, never; never; never; never - in nothing, great or small, large or petty - never give in except to convictions of honor and good sense - Winston Churchill You have enemies? Good. That means you've stood up for something, sometime in your life. - Winston Churchill
Re: [R] Mixed ANCOVA with between-Ss covariate?
I'm surprised there were no takers on this query; I thought it would be an easy answer, particularly since I provided an example data set and code. Did my request run afoul of the list etiquette?

On Tue, Feb 10, 2009 at 5:12 PM, Mike Lawrence m...@thatmike.com wrote: Hi all, I have data from an experiment with 3 independent variables, 2 within and 1 between. In addition to the dependent variable, I have a covariate that is a single measure per subject. Below I provide an example generated data set and my approach to implementing the ANCOVA. However, the output confuses me; why does the covariate only appear in the first stratum? Presumably it should appear in every stratum in which there can be between-Ss effects or interactions with between-Ss effects, no?

#generate data
set.seed(1)
a = rbind(
  expand.grid(
    id = 1:20
    , iv1 = 1:2
    , iv2 = 1:2
  )
)
a$group = factor(a$id < 11)
a$dv = rnorm(length(a[, 1]))
a$covariate = NA
for(i in unique(a$id)){
  a$covariate[a$id == i] = rnorm(1)
}

#make sure id, iv1, and iv2 are factorized
a$id = factor(a$id)
a$iv1 = factor(a$iv1)
a$iv2 = factor(a$iv2)

#run ANCOVA
covariate_aov = aov(dv ~ covariate + group*iv1*iv2 + Error(id/(iv1*iv2)), data = a)
summary(covariate_aov)
[R] Mixed ANCOVA with between-Ss covariate?
Hi all, I have data from an experiment with 3 independent variables, 2 within and 1 between. In addition to the dependent variable, I have a covariate that is a single measure per subject. Below I provide an example generated data set and my approach to implementing the ANCOVA. However, the output confuses me; why does the covariate only appear in the first stratum? Presumably it should appear in every stratum in which there can be between-Ss effects or interactions with between-Ss effects, no?

#generate data
set.seed(1)
a = rbind(
  expand.grid(
    id = 1:20
    , iv1 = 1:2
    , iv2 = 1:2
  )
)
a$group = factor(a$id < 11)
a$dv = rnorm(length(a[, 1]))
a$covariate = NA
for(i in unique(a$id)){
  a$covariate[a$id == i] = rnorm(1)
}

#make sure id, iv1, and iv2 are factorized
a$id = factor(a$id)
a$iv1 = factor(a$iv1)
a$iv2 = factor(a$iv2)

#run ANCOVA
covariate_aov = aov(dv ~ covariate + group*iv1*iv2 + Error(id/(iv1*iv2)), data = a)
summary(covariate_aov)
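One way to see why the covariate can only show up in the first (between-Ss) stratum: it is a single measure per subject, so it is constant within each id and has no within-Ss variance for the other error strata to explain. A quick check (an illustrative recreation of the design, not the original code):

```r
set.seed(1)
# same design shape as the post: 20 subjects crossed with two 2-level IVs
a <- expand.grid(id = 1:20, iv1 = 1:2, iv2 = 1:2)

# one covariate value per subject, recycled across that subject's rows
a$covariate <- rnorm(20)[a$id]

# the covariate has zero variance within each subject...
within.var <- tapply(a$covariate, a$id, var)
all(within.var == 0)  # TRUE

# ...so only the between-subject (id) stratum can test it
```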
[R] lme() direction
Hi guRus, I'm looking for advice on a good way to approach analysis of some multi-level data I've obtained. I had humans classify words and measured response time. Words were 10 positive words (happy, joy, etc.) and 10 negative words (sad, grumpy, etc.). Words were also presented in either white, red, or green color. All variables were manipulated within-Ss, and for each word-color combination I collected 10 observations. So the data would be something like:

set.seed(1)
a = rbind(
  cbind(
    type = 'positive'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('happy', 'joy')
      , repetition = 1:10
    )
  )
  , cbind(
    type = 'negative'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('sad', 'grumpy')
      , repetition = 1:10
    )
  )
)

#add some fake rt data
a$rt = rnorm(length(a[, 1]))

#And because people make errors sometimes:
a$error = rbinom(length(a[, 1]), 1, .1)

#remove error trials because they're not psychologically interesting:
a = a[a$error == 0, ]

I'm most interested in the interaction between color and type, but I know that there is likely an effect of word. Yet since word is not completely crossed with type, simply adding it to an aov() won't work. A colleague recommended I look into lme() but so far I can't figure out the proper call. Another issue is whether to collapse across repetition before running the stats, particularly since errors will leave unequal numbers of observations per cell if repetition is left in. Any advice anyone can provide would be great! Mike
Re: [R] lme() direction
Would it improve things if type were a continuous variable rather than categorical? I chose words at the extreme ends of a valence rating scale, but I still have the raw valence ratings for each word.

On Sat, Feb 7, 2009 at 12:02 PM, Dieter Menne dieter.me...@menne-biomed.de wrote: Mike Lawrence mike at thatmike.com writes: Thanks for the excellent reproducible sample set! I'm most interested in the interaction between color and type, but I know that there is likely an effect of word. Yet since word is not completely crossed with type, simply adding it to an aov() won't work. A colleague recommended I look into lme() but so far I can't figure out the proper call. Without word, it would be

summary(lme(rt ~ type*color, data = a, random = ~1|id))

With the interaction, the extreme would be

summary(lme(rt ~ type*color*word, data = a, random = ~1|id))

or, less extreme,

summary(lme(rt ~ type*color + color:word, data = a, random = ~1|id))

but all these fail because of the rather degenerate structure of your data set. While lmer in package lme4 allows for a wider set of solutions, I currently do not see how it could help, but I might be wrong with p = 0.5.

                         word
type     color    happy  joy  sad  grumpy
positive white      93    90    0     0
         red        90    88    0     0
         green      88    87    0     0
negative white       0     0   88    95
         red         0     0   91    85
         green       0     0   88    88

Another issue is whether to collapse across repetition before running the stats, particularly since errors will leave unequal numbers of observations per cell if it's left in. That's one of the points you have little to bother about with the lme approach. Collapsing would give equal weights to unequal numbers of repeats, though that may be of minor importance when not too extreme.
Dieter

set.seed(1)
a = rbind(
  cbind(
    type = 'positive'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('happy', 'joy')
      , repetition = 1:10
    )
  )
  , cbind(
    type = 'negative'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('sad', 'grumpy')
      , repetition = 1:10
    )
  )
)

#add some fake rt data
a$rt = rnorm(length(a[, 1]))

#And because people make errors sometimes:
a$error = rbinom(length(a[, 1]), 1, .1)

#remove error trials because they're not psychologically interesting:
a = a[a$error == 0, ]

library(nlme)
ftable(a[, c(1, 3, 4)])
summary(lme(rt ~ type*color, data = a, random = ~1|id))
Re: [R] lme() direction
And if I decided to ignore the type variable altogether and simply use the continuous valence variable, this is what I'd use?

summary(lme(
  fixed = rt ~ valence*color
  , data = a
  , random = ~1|id
))

I also have continuous luminance measurements of each color that vary from participant to participant (we used different monitors). If I were interested in luminance *instead* of color, would the following be appropriate, or do I need to do something special to account for the fact that each participant has different values of luminance?

summary(lme(
  fixed = rt ~ valence*luminance
  , data = a
  , random = ~1|id
))

On Sat, Feb 7, 2009 at 12:34 PM, Dieter Menne dieter.me...@menne-biomed.de wrote: Mike Lawrence mike at thatmike.com writes: Would it improve things if type were a continuous variable rather than categorical? I chose words at the extreme ends of a valence rating scale but I still have the raw valence ratings for each word. With the interaction, the extreme would be summary(lme(rt~type*color*word, data=a,random=~1|id)) or, less extreme summary(lme(rt~type*color+color:word, data=a,random=~1|id)) .. Something like

summary(lme(rt ~ type*color + color:as.numeric(word), data = a, random = ~1|id))

(please replace as.numeric() by the raw valence; the example above is simply wrong) could gain you a few degrees of freedom if you are willing to accept the linear hypothesis. And as there is something like raw valence, one should not throw away details about a-priori ordering in favor of a categorical hypothesis. Dieter
Re: [R] Problems in Recommending R
One of my colleagues is an interdisciplinary PhD in Design and Psychology, and he has an in with a design school where we might be able to get students to take on the redesign of the website. He asks: In order to ensure efficient consumption of resources and maximize our return on investment, please provide potential designers with a direct point of contact (name, email, telephone number) so that they may request a project description and feedback. Obviously the redesign idea was generated in a community thread, but if anyone from the R Foundation can step up as such a contact person, I will forward your info to my colleague, who will then take the temperature of students at the design school.
Re: [R] From z to Rho
Do these functions help?

#Fisher's r-to-z:
fr2z <- function(r) atanh(r)
#Fisher's z-to-r:
fz2r <- function(z) tanh(z)

On Fri, Jan 30, 2009 at 9:29 AM, LE PAPE Gilles lepape.gil...@neuf.fr wrote: Hi, when performing a spearman_test stratified by a given factor in package coin, how is it possible to obtain the value of Rho, the Spearman correlation coefficient? Thanks in advance, Gilles (F)
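For reference, atanh() is exactly the textbook Fisher transform z = ½·ln((1+r)/(1−r)), and the two functions round-trip. A quick check:

```r
fr2z <- function(r) atanh(r)  # Fisher's r-to-z
fz2r <- function(z) tanh(z)   # Fisher's z-to-r

r <- 0.42
z <- fr2z(r)

# atanh() equals the classic half-log formula
all.equal(z, 0.5 * log((1 + r) / (1 - r)))

# and the transform inverts cleanly
all.equal(fz2r(z), r)
```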
Re: [R] help with time series
To separate the columns, use the sep argument in read.table():

mystage <- read.table("C:\\Documents and Settings\\skfriedman\\Desktop\\R-scripts\\stage.txt", header = TRUE, sep = ",")

On Fri, Jan 30, 2009 at 4:17 PM, steve_fried...@nps.gov wrote: Hello everyone, I'm working with R 2.8.1 on a Windows machine. I have a question regarding time series analysis. The first question is: how does R expect the input file to be structured? I'm working with a *.txt file similar to the abbreviated one here:

Date,stage
4/2/1953,7.56
4/3/1953,7.56
4/4/1953,7.54
4/5/1953,7.53
4/6/1953,7.5
4/7/1953,7.47
4/8/1953,7.44
4/9/1953,7.41
4/10/1953,7.37
4/11/1953,7.33
4/12/1953,7.3
4/13/1953,7.26
4/14/1953,7.28
4/15/1953,7.28
4/16/1953,7.23
4/17/1953,7.47
4/18/1953,7.59
4/19/1953,7.58
4/20/1953,7.57
4/21/1953,7.56
4/22/1953,7.55
4/23/1953,7.53
4/24/1953,7.51
4/25/1953,7.48
4/26/1953,7.46
4/27/1953,7.5
4/28/1953,7.56

The data record is substantially longer - 50 years' worth of daily hydrologic water stage data (column 2). R seems to get confused by the format of this input, not knowing what to do with the date field, and also deciding to treat everything as a level. I'm reading the data as follows:

mystage <- read.table("C:\\Documents and Settings\\skfriedman\\Desktop\\R-scripts\\stage.txt", header = TRUE)

Looking at the data I get the following:

mystage[1:4,]
[1] 4/2/1953,7.56 4/3/1953,7.56 4/4/1953,7.54 4/5/1953,7.53
20195 Levels: 1/1/1954,8.72 1/1/1955,8.48 1/1/1956,7.94 1/1/1957,7.88 1/1/1958,8.5 ... 9/9/2007,8.84

What I'd like is a time series with a starting date of April 21, 1953, ending December 30, 2008. Data are daily records, so the frequency should be 365 (?), not counting leap year nuisances. So the first question is how should I build the input file to correctly import it to a time series with an odd beginning date?
The analysis I'm really trying to get to will involve calculating the mean monthly stage, the mean seasonal (aggregated over several months) stage, and the annual maximum period with a continuous stage greater than 0. Thanks in advance; I will summarize solutions. Much appreciated, Steve

Steve Friedman Ph.D. Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 steve_fried...@nps.gov Office (305) 224-4282 Fax (305) 224-4147
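Beyond sep = ",", the Date column needs to be parsed explicitly or it will be read as a factor (the "20195 Levels" symptom above). A self-contained sketch using textConnection() in place of the file (the three data rows are just the start of the posted sample):

```r
txt <- "Date,stage
4/2/1953,7.56
4/3/1953,7.56
4/4/1953,7.54"

# read with an explicit separator, then convert the date strings
mystage <- read.table(textConnection(txt), header = TRUE, sep = ",",
                      stringsAsFactors = FALSE)
mystage$Date <- as.Date(mystage$Date, format = "%m/%d/%Y")

class(mystage$Date)  # "Date"

# monthly means then follow directly, e.g.:
# aggregate(stage ~ format(Date, "%Y-%m"), data = mystage, FUN = mean)
```

With Date-class dates there is no need to force the data into a fixed-frequency ts(); daily series with an odd start date are easier to handle as a (date, value) data frame or with a package like zoo.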
Re: [R] Help with normal distribution in random samples...
?rnorm

On Wed, Jan 28, 2009 at 4:04 PM, Sea Captain 1779 dbo...@measinc.com wrote: Hi!!! First time 'R' user looking for a little assistance. Here is what I have so far:

practice1 = matrix((runif(5000, min = 0, max = 12)), 100)

which is creating 50 samples, for 100 cases, distributed between 0-12. What I would like is to be able to set the mean and SD so that the data is normally distributed around, let's say, 7. Any help I can get with achieving that goal would be greatly appreciated!!! -Dan -- View this message in context: http://www.nabble.com/Help-with-normal-distribution-in-random-samples...-tp21713636p21713636.html Sent from the R help mailing list archive at Nabble.com.
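Concretely, swapping runif() for rnorm() with the desired mean and SD (the SD value here is just illustrative, since the original question only fixes the mean):

```r
set.seed(1)
# 50 samples of 100 cases each, normally distributed around 7 with SD 1.5
practice1 <- matrix(rnorm(5000, mean = 7, sd = 1.5), nrow = 100)

dim(practice1)   # 100 x 50
mean(practice1)  # close to 7
```

Note that unlike runif(), rnorm() is unbounded, so a few values will fall outside 0-12; if hard bounds matter, a truncated normal would be needed instead.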
Re: [R] Mode (statistics) in R?
Here's a rather convoluted way of finding the mode (or, at least, the first mode):

x = round(rnorm(100, sd = 5))
my_mode = as.numeric(names(table(x))[which.max(table(x))])

On Mon, Jan 26, 2009 at 9:28 AM, Jason Rupert jasonkrup...@yahoo.com wrote: Hopefully this is a pretty simple question: Is there a function in R that calculates the mode of a sample? That is, I would like to be able to determine the value that occurs most frequently in a data set. I tried the default R mode() function, but it appears to report a storage type or something else. I tried RSeek and some R documentation that I downloaded, but nothing seems to mention calculating the mode. Thanks again.
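Wrapped as a reusable function (the name stat_mode is illustrative, to avoid clashing with base R's mode(), which reports storage mode); note that which.max() keeps the first maximum, so ties resolve to the smallest tied value:

```r
# first mode of a numeric vector, via a frequency table
stat_mode <- function(x) {
  tab <- table(x)
  as.numeric(names(tab)[which.max(tab)])  # which.max keeps the first on ties
}

stat_mode(c(1, 2, 2, 3, 3, 3))  # 3
```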
Re: [R] R for Computational Neuroscience?
Thanks to Ken and Bernardo for their attempts to answer my question, but I was apparently unclear as to what I meant by computational neuroscience. The tools Ken and Bernardo suggest provide means to analyze data from neuroscience research, but I'm actually looking for means to simulate biologically realistic neural systems. Maybe it would help if I provided some keywords, so here is a list of chapters/sections from the MATLAB section of the book How the brain computes: Network fundamentals of computational neuroscience by Thomas Trappenberg:

12 A MATLAB guide to computational neuroscience
12.1 Introduction to the MATLAB programming environment
12.2 Spiking neurons and numerical integration in MATLAB
12.2.1 Integrating Hodgkin-Huxley equations with the Euler method
12.2.2 The Wilson model and advanced integration
12.2.3 MATLAB function files
12.2.4 Leaky integrate-and-fire neuron
12.2.5 Poisson spike trains
12.2.6 Netlet formulas by Anninos et al.
12.3 Associators and Hebbian learning
12.3.1 Hebbian weight matrix in rate models
12.3.2 Hebbian learning with weight decay
12.4 Recurrent networks and network dynamics
12.4.1 Example of a complete network simulation
12.4.2 Quasi-continuous attractor network
12.4.3 Networks with random asymmetric weight matrix
12.4.4 The Lorenz attractor
12.5 Continuous attractor neural networks
12.5.1 Path-integration
12.6 Error-backpropagation network