Re: [R] Help optimizing EMD::extrema()
Wow, great work! This makes a huge difference (i.e. 5 mins to process an electrode instead of 5 hours). I believe that ndata and ndatam1 are there to let EMD::emd() and EMD::extractimf() (the functions that call EMD::extrema()) implement various boundary corrections. I personally don't intend to examine the start or end in great detail, so I think the uncorrected version is fine for my application. Thanks again!

Mike

On Sun, Feb 13, 2011 at 6:57 PM, William Dunlap wdun...@tibco.com wrote:

I did not try to emulate the ndata and ndatam1 arguments to extrema(), as I didn't see what they were for. The help file simply says they are the length of the first argument and that minus 1, and those are their default values. If they do not have their default values then extrema() frequently dies. You could add them to the argument list and not use them, or check that they are what they default to, as in

    function(x, ndata, ndatam1) {
        stopifnot(length(x) == ndata, ndata - 1 == ndatam1)
        ... rest of code ...
    }

If the check fails then someone needs to say or figure out what they are for.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
Sent: Sunday, February 13, 2011 2:08 PM
To: Mike Lawrence; r-h...@lists.r-project.org
Subject: Re: [R] Help optimizing EMD::extrema()

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mike Lawrence
Sent: Friday, February 11, 2011 9:28 AM
To: r-h...@lists.r-project.org
Subject: [R] Help optimizing EMD::extrema()

Hi folks, I'm attempting to use the EMD package to analyze some neuroimaging data (timeseries with 64 channels sampled across 1 million time points within each of 20 people). I found that processing a single channel of data using EMD::emd() took about 8 hours. Exploration using Rprof() suggested that most of the compute time was spent in EMD::extrema().
Looking at the code for EMD::extrema(), I managed to find one obvious speedup (switching from employing rbind() to c()), and I suspect that there may be a way to further speed things up by pre-allocating all the objects that are currently being created with c(), but I'm having trouble understanding the code sufficiently to know when/where to try this and what sizes to set as the default pre-allocation length. Below I include code that demonstrates the speedup I achieved by eliminating calls to rbind(), and also demonstrates that only a few calls to c() seem to be responsible for most of the compute time. The files extrema_c.R and extrema_c2.R are available at: https://gist.github.com/822691

Try the following new.extrema(). It is based on looking at runs in the data in a vectorized way. On my old laptop the running times for length(x) = 2^(2:18) with EMD::extrema and new.extrema are

           old.time new.time
        4      0.00     0.00
        8      0.00     0.00
       16      0.00     0.00
       32      0.00     0.00
       64      0.00     0.00
      128      0.00     0.00
      256      0.00     0.00
      512      0.02     0.00
     1024      0.03     0.00
     2048      0.06     0.01
     4096      0.14     0.00
     8192      0.37     0.02
    16384      1.08     0.03
    32768      3.64     0.06
    65536     13.35     0.12
   131072     48.42     0.25
   262144    206.17     0.59

(Comparison and logical operators below were stripped in transit; they are reconstructed here.)

    isEndOfStrictlyIncreasingRun <- function(x) {
        c(FALSE, diff(diff(x) > 0) < 0, FALSE)
    }
    isStartOfStrictlyIncreasingRun <- function(x) {
        c(FALSE, diff(diff(x) <= 0) < 0, FALSE)
    }
    isEndOfStrictlyDecreasingRun <- function(x) {
        isEndOfStrictlyIncreasingRun(-x)
    }
    isStartOfStrictlyDecreasingRun <- function(x) {
        isStartOfStrictlyIncreasingRun(-x)
    }
    isStartOfZeroRun <- function(x) {
        x == 0 & c(TRUE, x[-length(x)] != 0)
    }
    nToEndOfCurrentFlatRun <- function(x) {
        # for each element of x, how far to end of current
        # run of equal values.
        rev(sequence(rle(rev(x))$lengths) - 1L)
    }
    indexOfEndOfCurrentFlatRun <- function(x) {
        # should be a more direct way of doing this, but this is pretty quick
        seq_len(length(x)) + nToEndOfCurrentFlatRun(x)
    }
    new.extrema <- function(x) {
        f <- indexOfEndOfCurrentFlatRun(x)
        isMaxStart <- isEndOfStrictlyIncreasingRun(x) & isStartOfStrictlyDecreasingRun(x)[f]
        maxindex <- cbind(which(isMaxStart), f[isMaxStart], deparse.level = 0)
        isMinStart <- isEndOfStrictlyDecreasingRun(x) & isStartOfStrictlyIncreasingRun(x)[f]
        minindex <- cbind(which(isMinStart), f[isMinStart], deparse.level = 0)
        # zero-crossings are a bit weird: Report either the before-zero/after-zero
        # pair or the start and stop of a run of zero's (even if the run is
        # not part of an actual zero-crossing). Do them separately then sort.
        # Also, if the entire sequence never actually crosses 0, do
        # not report the zero
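To make the run-based trick concrete, here is a small standalone sketch of my own (not from Bill's post) showing what the two core helpers compute on a tiny vector, with their bodies restated so it runs on its own:

```r
# Standalone illustration of the run-based vectorization: rle() measures
# runs of equal values, and diff() of a logical vector marks run boundaries.
x <- c(1, 2, 2, 1, 0, 0, 1)

# distance from each element to the end of its run of equal values
nToEnd <- rev(sequence(rle(rev(x))$lengths) - 1L)

# ends of strictly increasing runs: an up-step followed by a non-up-step
endInc <- c(FALSE, diff(diff(x) > 0) < 0, FALSE)

nToEnd         # 0 1 0 0 1 0 0
which(endInc)  # 2 (the increase 1 -> 2 ends at position 2)
```

Because every step is a whole-vector operation, cost grows linearly with length(x), which is why the timings above stay flat where the c()/rbind() version explodes.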
[R] Help optimizing EMD::extrema()
Hi folks, I'm attempting to use the EMD package to analyze some neuroimaging data (timeseries with 64 channels sampled across 1 million time points within each of 20 people). I found that processing a single channel of data using EMD::emd() took about 8 hours. Exploration using Rprof() suggested that most of the compute time was spent in EMD::extrema(). Looking at the code for EMD::extrema(), I managed to find one obvious speedup (switching from employing rbind() to c()), and I suspect that there may be a way to further speed things up by pre-allocating all the objects that are currently being created with c(), but I'm having trouble understanding the code sufficiently to know when/where to try this and what sizes to set as the default pre-allocation length. Below I include code that demonstrates the speedup I achieved by eliminating calls to rbind(), and also demonstrates that only a few calls to c() seem to be responsible for most of the compute time. The files extrema_c.R and extrema_c2.R are available at: https://gist.github.com/822691 Any suggestions/help would be greatly appreciated.

    #load the EMD library for the default version of extrema
    library(EMD)

    #some data to process
    values = rnorm(1e4)

    #profile the default version of extrema
    Rprof(tmp <- tempfile())
    temp = extrema(values)
    Rprof()
    summaryRprof(tmp) #1.2s total, with most time spent doing rbind
    unlink(tmp)

    #load a rbind-free version of extrema
    source('extrema_c.R')
    Rprof(tmp <- tempfile())
    temp = extrema_c(values)
    Rprof()
    summaryRprof(tmp) #much faster! .5s total
    unlink(tmp)

    #still, it encounters slowdowns with lots of data
    values = rnorm(1e5)
    Rprof(tmp <- tempfile())
    temp = extrema_c(values)
    Rprof()
    summaryRprof(tmp) #44s total, hard to see what's taking up so much time
    unlink(tmp)

    #load an rbind-free version of extrema that labels each call to c()
    source('extrema_c2.R')
    Rprof(tmp <- tempfile())
    temp = extrema_c2(values)
    Rprof()
    summaryRprof(tmp) #same time as above, but now we see that it spends more time in certain calls to c() than others
    unlink(tmp)

__________________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
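The slowdown Mike is profiling is the classic grow-by-c() pattern: each c(out, i) copies the whole vector, giving quadratic cost. A minimal sketch (my own, not from the thread) contrasting growth with pre-allocation:

```r
# Growing a vector with c() inside a loop reallocates and copies on every
# iteration (O(n^2) work overall); pre-allocating does the work in O(n).
grow <- function(n) {
  out <- numeric(0)
  for (i in 1:n) out <- c(out, i)  # full copy each iteration
  out
}
prealloc <- function(n) {
  out <- numeric(n)                # allocate once
  for (i in 1:n) out[i] <- i      # write in place
  out
}

n <- 1e4
system.time(grow(n))      # noticeably slower as n grows
system.time(prealloc(n))  # roughly linear in n
```

The results are identical; only the allocation strategy differs, which is why replacing the hot c() calls in extrema() with pre-allocated writes (or a fully vectorized formulation, as in new.extrema()) pays off so dramatically.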
[R] [R-pkgs] ez version 3.0
Hi folks, I'd like to announce the release of version 3.0 of the ez package. This package was developed to aid those who are new to statistical programming. Over the course of several years of helping colleagues and students learn R, I observed that folks are often initially turned off R because they have difficulty obtaining SPSS-like results quickly (SPSS is the dominant environment in my field, psychology). ez attempts to fill this gap, providing quick and easy analysis and graphics for common experimental designs. By easing the early portions of the R learning curve, ez hopes to promote the spread of R as a means of open source and reproducible analysis.

ez may also be of interest to more advanced users, as it includes the ezMixed() function, which automates the assessment of fixed effects in a mixed effects modelling context, and the ezPredict() function, which obtains predictions for the fixed effects from a mixed effects model.

Installing ez

Version 3.0 of ez requires that you have R 2.12.1 installed, which is the latest version of R as of this notice. If you have an older version of R, you will need to update R by installing the latest version (from http://cran.r-project.org/) before installing ez.
Windows and Linux users should be able to install the latest version by running the command:

    install.packages('ez')

Mac users should be able to install the latest version by running the commands:

    install.packages(c('car', 'reshape2', 'plyr', 'ggplot2', 'stringr', 'lme4', 'Matrix'))
    install.packages('ez', type='source', dependencies=F)

Once installed, running the following commands will load ez and bring up its help page, which links to descriptions of all of ez's functions:

    library(ez)
    ?ez

Big changes in version 3.0
- A big rework of ezANOVA() to permit more flexibility, including more nuanced handling of numeric predictor variables, specification of sums-of-squares types when data is imbalanced, and an option to compute/return an aov object representing the requested ANOVA for follow-up contrast analysis. (The latter two features follow from the discussion at http://stats.stackexchange.com/questions/6208/should-i-include-an-argument-to-request-type-iii-sums-of-squares-in-ezanova)
- An important bugfix for ezMixed(), which previously permitted specification of multiple random effects but silently ignored all but the last!
- A big rework of ezMixed(), completely changing the output (including removal of p-values, following the advice of Pinheiro & Bates (2000) and many on the R-SIG-Mixed-Models mailing list) and providing a new feature whereby the linearity of numeric fixed effects can be assessed by comparison to higher polynomial degrees.

Also new

As noted above, this version fixes a big bug in ezMixed() about which I wish I could have warned users sooner. To facilitate future rapid notification of users, I've created a discussion group (http://groups.google.com/group/ez4r) to which users can/should subscribe to keep up to date on development news. Additionally, I created a project page on github (https://github.com/mike-lawrence/ez/issues) where users can submit bug reports and feature requests.
Finally, I encourage users to rate or review ez (and any other packages you use) at crantastic.org (http://crantastic.org/packages/ez).

Official CHANGES entry

3.0-0
- added urls in all documentation pointing to the bug-report/feature-request site (https://github.com/mike-lawrence/ez/issues) and the discussion group (http://groups.google.com/group/ez4r).
- changed reshape dependency to reshape2
- ezANOVA
  - fixed bug such that if detailed=F and car::Anova fails, the first line of results from stats::aov is cut off.
  - Added more nuanced treatment of numeric variables
  - Added type argument to specify sums-of-squares type selection (1, 2, or 3)
  - Added white.adjust argument to permit heteroscedasticity adjustment (see ?car::Anova)
  - Added return_aov argument to permit returning a stats::aov object along with results (this was requested by a user seeking to use the aov object to compute contrasts)
- ezMixed
  - IMPORTANT: Fixed bug such that only the last specified random effect was implemented
  - fixed bug such that an error occurred if only one fixed effect was specified
  - changed output format to a list containing a summary data frame, a list of formulae, a list of errors, a list of warnings, and (optionally) a list of fitted models
  - Changed format of summary output, including removal of p-values (on the strong advice of many that the p-values from a likelihood ratio test of a fixed effect are highly questionable)
  - removed the return_anovas argument
  - added nuanced ability to explore non-linear effects via addition of fixed_poly and fixed_poly_max arguments
- ezPredict
  - Added ability to handle models fit with I()'d variables
  - Added stop error when encountering models with poly() and no supplied prediction data
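For readers curious what the new sums-of-squares `type` option involves under the hood, here is a sketch of my own (not from the announcement, and using base R's drop1() rather than ez's internals): type-III-style marginal tests are only meaningful with sum-to-zero contrasts, which must be set before fitting.

```r
# Type-III-style marginal F tests by hand: sum-to-zero contrasts + drop1().
# ezANOVA's `type` argument packages the equivalent via car::Anova.
set.seed(1)
d <- data.frame(
  y = rnorm(40),
  A = factor(rep(1:2, each = 20)),
  B = factor(rep(1:2, times = 20))
)

op <- options(contrasts = c("contr.sum", "contr.poly"))  # required for type III
mod <- lm(y ~ A * B, data = d)
tab <- drop1(mod, . ~ ., test = "F")  # marginal F test for each term
options(op)                            # restore previous contrasts
tab
```

With balanced data all sums-of-squares types agree; the distinction only matters for imbalanced designs, which is exactly the case the announcement mentions.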
[R] hierarchical mixture modelling
Hi folks, I have circular data that I'd like to model as a mixture of a uniform and a von Mises centered on zero. The data are from 2 groups of human participants, where each participant is measured repeatedly within each of several conditions. So, the data frame would look something like:

    design = expand.grid(
        person = 1:20 ,
        variable1 = 1:2 ,
        variable2 = 1:3 ,
        repetition = 1:100
    )
    design$group = design$person %% 2

where each row would have a data point. Now, I know how to fit the mixture of a uniform and von Mises to each individual cell of the data independently using the EM algorithm, yielding estimates of the mixture proportion and von Mises concentration per cell. However, I of course want to assess the degree to which the group and other variables in the design affect these model parameters, and at least in the case of the proportion estimate, I'm uncomfortable submitting the raw proportion to a test that is going to assume Gaussian error (eg. ANOVA, or lmer(..., family=gaussian)). I'm aware that lmer lets one specify non-gaussian links, but as I understand it, if I wanted to, say, specify the binomial link (which seems appropriate for a proportion), lmer wants the data to be the raw 1's and 0's, not the proportion estimate obtained from EM. I've heard that there are hierarchical mixture modelling methods out there (possibly Bayesian hierarchical mixture modelling) that might let me achieve model fitting and inference in one step (eg. model the mixture and the influence on each parameter from the between- and within-person variables, treating people as random effects), but I'm having trouble tracking down instructions on how to do this. Any pointers would be greatly appreciated!

Cheers, Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think.
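For concreteness, the per-cell EM fit described above can be sketched as follows. This is my own code, not from the thread: a two-component mixture of a circular uniform and a von Mises with mean fixed at zero, where the concentration update uses the standard Best-Fisher approximation to the inverse of A1(kappa). The wrapped-normal draws are only a stand-in for proper von Mises simulation.

```r
# EM for a mixture of circular uniform and von Mises(mu = 0, kappa),
# estimating the mixture proportion p and concentration kappa.
dvm0 <- function(theta, kappa) {          # von Mises density with mu = 0
  exp(kappa * cos(theta)) / (2 * pi * besselI(kappa, 0))
}
a1inv <- function(R) {                    # Best-Fisher approximation to A1^{-1}
  if (R < 0.53) 2 * R + R^3 + 5 * R^5 / 6
  else if (R < 0.85) -0.4 + 1.39 * R + 0.43 / (1 - R)
  else 1 / (R^3 - 4 * R^2 + 3 * R)
}
em_unif_vm <- function(theta, p = 0.5, kappa = 1, iter = 100) {
  for (i in seq_len(iter)) {
    num <- p * dvm0(theta, kappa)         # E-step: responsibilities
    w <- num / (num + (1 - p) / (2 * pi))
    p <- mean(w)                          # M-step: mixture proportion
    R <- max(sum(w * cos(theta)) / sum(w), 1e-8)  # weighted resultant (mu = 0)
    kappa <- a1inv(min(R, 0.99))          # concentration update
  }
  list(p = p, kappa = kappa)
}

# simulated check: ~70% concentrated near 0, ~30% circular uniform
set.seed(1)
n <- 2000
z <- runif(n) < 0.7
wrapped <- rnorm(n, 0, 0.55)              # wrapped-normal stand-in for von Mises
theta <- ifelse(z, atan2(sin(wrapped), cos(wrapped)), runif(n, -pi, pi))
fit <- em_unif_vm(theta)
fit$p      # near 0.7
fit$kappa  # a few units (roughly 1/0.55^2 scale)
```

Fitting each cell this way gives the per-cell (p, kappa) estimates; the hierarchical question in the post is about replacing this two-stage procedure with a single model in which p and kappa themselves get participant-level random effects.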
~
[R] [R-pkgs] ez version 2.0
The ez package was developed to aid those who are new to statistical programming. Over the course of several years of helping colleagues and students learn R, I observed that folks are often initially turned off R because they have difficulty obtaining SPSS-like results quickly (SPSS is the dominant environment in my field, psychology). ez attempts to fill this gap, providing quick and easy analysis and graphics for common experimental designs. By easing the early portions of the R learning curve, ez hopes to promote the spread of R as a means of open source and reproducible analysis. ez now also attempts to pique interest in cutting-edge statistical methods by providing easy specification and visualization of simple* mixed effects models.

*-- mixed effects models are limited to those with a single random effect (eg. Participant) and no numeric predictors.

New in version 2.0
- ezDesign(), a function to visualize the balance of data across a specified experimental design (useful for diagnosing missing data)
- ezPrecis(), a function to summarize a data set (inspired by summary(), str(), and YaleToolkit::whatis())
- ezBoot() and ezPlotBoot(), functions to compute and visualize (respectively) bootstrapped confidence intervals for either cell means or predictions from a mixed effects model
- ezANOVA() updated with an improved measure of effect size: generalized eta-squared
- ezPlot() updated to permit simultaneous plotting of multiple DVs, with each mapped to a row of the plot facets
- see the changelog for further changes

Enjoy!
Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think.
~

___________________________________________
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
[R] compare gam fits
Hi folks, I originally tried R-SIG-Mixed-Models for this one (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q3/004170.html), but I think that the final steps to a solution aren't mixed-model specific, so I thought I'd ask my final questions here.

I used gamm4 to fit a generalized additive mixed model to data from an AxBxC design, where A is a random effect (human participants in an experiment), B is a 2-level factor predictor variable, and C is a continuous variable that is likely non-linear. I tell gamm4 to fit a smooth across C to each level of B independently, and I can use predict.gam(..., se.fit=T) to obtain predictions from the fitted model as well as the standard error for the predictions.

I'd like to visualize the BxC interaction to see if smoothing C within each level of B was really necessary, and if so, where it is (along the C dimension) that B affects the smooth. It's easy enough to obtain the predicted B1-B2 difference function, but I'm stuck on how to convey the uncertainty of this function (e.g. computing the confidence interval of the difference at each value of C). One thought is that predict.gam(..., se.fit=T) returns SE values, so if I could find out the N on which these SE values are computed, I could compute the difference CI as

    sqrt(((SD_B1)^2 + (SD_B2)^2) / N) * qt(.975, df = N - 1)

However, I can't seem to figure out what value of N was used to compute the SEs that predict.gam(..., se.fit=T) produces. Can anyone point me to where I might find N? Further, is N-1 the proper df for the call to qt()? Finally, with a smooth function and 95% confidence intervals computed at each of a large number of points, don't I run into a problem of an inflated Type I error rate? Or does the fact that each point is not independent of those next to it make this an inappropriate concern?

Cheers, Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting?
Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
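One route around the "what is N?" question entirely: predict.gam's SEs are model-based (square roots of the diagonal of X V X', with V the coefficient covariance matrix), so the difference and its CI can be computed directly from the linear-predictor matrix. A sketch of mine (assuming mgcv's gam on simulated data; for a gamm4 fit the same applies to its $gam component):

```r
# Pointwise CI for the difference between two by-factor smooths, via
# predict(..., type = "lpmatrix") and the coefficient covariance matrix.
library(mgcv)
set.seed(1)
d <- data.frame(C = runif(400), B = factor(rep(1:2, each = 200)))
d$y <- sin(2 * pi * d$C) * (d$B == "1") + rnorm(400, 0, 0.3)
m <- gam(y ~ B + s(C, by = B), data = d)

Cgrid <- seq(0, 1, length.out = 100)
nd1 <- data.frame(C = Cgrid, B = factor(1, levels = 1:2))
nd2 <- data.frame(C = Cgrid, B = factor(2, levels = 1:2))

# difference of the two linear-predictor matrices
Xd <- predict(m, nd1, type = "lpmatrix") - predict(m, nd2, type = "lpmatrix")

est <- drop(Xd %*% coef(m))                   # predicted B1 - B2 at each C
se  <- sqrt(rowSums((Xd %*% vcov(m)) * Xd))   # SE of the difference
upper <- est + 1.96 * se
lower <- est - 1.96 * se
```

Note these are pointwise intervals, which speaks to the Type I worry at the end of the post: for claims about the whole curve you would want a simultaneous interval (e.g. by simulating coefficient vectors from their approximate posterior and taking the max absolute standardized deviation), which widens the band to account for looking everywhere along C at once.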
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Check out the reshape package for transforming data from long to wide and vice versa. Yet I still don't know what problem you've encountered with ezANOVA. Using the data you just sent, where Day now has 3 levels, I reformat back to the presumably original long format and find that ezANOVA returns the same sphericity tests as John's solution (which is expected, because ezANOVA is a wrapper to Anova):

    library(ez)
    a = read.table('Sergios_wide_data.txt', header=T)
    b = melt.data.frame(data=a, id.vars=c('subject','treatment'), variable_name='day')
    ezANOVA(data=b, dv=.(value), sid=.(subject), within=.(treatment,day))

On Mon, Nov 9, 2009 at 10:20 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Thank you very much. Finally got it to work. However, I had to recode it from columns subject/treatment/DV (where all my response data was in one DV column) to columns subject/treatment/day1/day2/day3 (where my response data is now in three different columns). Is there a way to do that without hand recoding (cutting and pasting in a spreadsheet)? Thank you for your help. Glad it works as is.

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 7:12 PM, John Fox j...@mcmaster.ca wrote:

Dear Sergios, Why don't you try what I suggested originally? Adapted to this data set:

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(Day=factor(1:3))
    summary(Anova(mod, idata=idata, idesign=~Day))

Peter Dalgaard also pointed toward an article that describes how to do the same thing with anova().
Regards,
John

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sergios (Sergey) Charntikov
Sent: November-09-09 7:13 PM
To: Mike Lawrence
Cc: r-help@r-project.org
Subject: Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)

Hi Mike, I tried to run my data in SPSS and it works fine without any problems: plug in my levels, plug in my covariate (since it is all within) and get my Mauchly tests. I tried to rearrange the data so it looks like this (subj/treatment/day1/day2/day3):

    subject treatment day1 day2 day3
    1       1         8    8    8
    1       2         5    7    5
    2       1         7    4    4
    2       2         4    5    7
    3       1         8    6    4
    3       2         5    2    4
    4       1         2    9    4
    4       2         1    9    1
    5       1         4    8    1
    5       2         7    8    2
    6       1         4    7    2
    6       2         4    5    2

When I try

    mlmfit <- lm(Dataset ~ 1)

I get: invalid type (list) for variable 'Dataset'. When I try

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(factor(rep(c(Dataset$day1, Dataset$day2, Dataset$day3))), ordered(Dataset$Treatment))
    Anova(mod, idata=idata, idesign=~Dataset$Treatment)

I get: Terms in the intra-subject model matrix are not orthogonal. When I try is.matrix(Dataset), I get no. My original mock Dataset (attached in txt) is below. Maybe I am not coding it right? I would hate to recode all my data for SPSS, since at the end I would need to show that sphericity was not violated.

    Subj Trtmt Sessn Response
    1    N     1     5
    1    D     1     6
    1    N     2     4
    1    D     2     7
    2    N     1     8
    2    D     1     9
    2    N     2     2
    2    D     2     1
    3    N     1     4
    3    D     1     5
    3    N     2     6
    3    D     2     2
    4    N     1     5
    4    D     1     6
    4    N     2     4
    4    D     2     7
    5    N     1     8
    5    D     1     9
    5    N     2     2
    5    D     2     1
    6    N     1     4
    6    D     1     5
    6    N     2     6
    6    D     2     2

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 5:29 PM, Mike Lawrence mike.lawre...@dal.ca wrote:

No luck as in...? What error did you encounter?
In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily holds for repeated measures with only 2 levels, and tests are really only possible for repeated measures with 3 or more levels. I think it's analogous to how you don't need to test homogeneity of variance when performing a paired t-test; the test ends up representing the pairs as a single distribution of difference scores with a single variance.

Mike

On Mon, Nov 9, 2009 at 5:30 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Tried EZanova, no luck with my particular dataset
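On the side question of converting long data to wide without hand-recoding in a spreadsheet: besides melt()/cast() from the reshape package, base R's reshape() handles this too. A sketch of my own using a mock data frame shaped like the one in the thread (the column names are illustrative):

```r
# Long -> wide with base R's reshape(): one row per subject x treatment,
# one column per day.
long <- data.frame(
  subject   = rep(1:3, each = 3),
  treatment = 1,
  day       = rep(1:3, times = 3),
  value     = 1:9
)
wide <- reshape(long, idvar = c("subject", "treatment"),
                timevar = "day", direction = "wide")
# wide now has columns: subject, treatment, value.1, value.2, value.3

# reshape() stores its bookkeeping as attributes, so the reverse trip
# needs no extra arguments:
long2 <- reshape(wide, direction = "long")
```

melt.data.frame() as used above is the reshape-package equivalent of the reverse (wide to long) direction.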
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Oops, I see now that despite repeated subject names, treatment is a between-Ss variable, so you need to use this to get the equivalent of Anova:

    library(ez)
    a = read.table('Sergios_wide_data.txt', header=T)
    b = melt.data.frame(data=a, id.vars=c('subject','treatment'), variable_name='day')
    b$subject = paste(b$subject, b$treatment) #create unique sids
    ezANOVA(data=b, dv=.(value), sid=.(subject), within=.(day), between=.(treatment))

On Tue, Nov 10, 2009 at 6:47 AM, Mike Lawrence mike.lawre...@dal.ca wrote:

Check out the reshape package for transforming data from long to wide and vice versa. Yet I still don't know what problem you've encountered with ezANOVA. Using the data you just sent, where Day now has 3 levels, I reformat back to the presumably original long format and find that ezANOVA returns the same sphericity tests as John's solution (which is expected, because ezANOVA is a wrapper to Anova):

    library(ez)
    a = read.table('Sergios_wide_data.txt', header=T)
    b = melt.data.frame(data=a, id.vars=c('subject','treatment'), variable_name='day')
    ezANOVA(data=b, dv=.(value), sid=.(subject), within=.(treatment,day))

On Mon, Nov 9, 2009 at 10:20 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Thank you very much. Finally got it to work. However, I had to recode it from columns subject/treatment/DV (where all my response data was in one DV column) to columns subject/treatment/day1/day2/day3 (where my response data is now in three different columns). Is there a way to do that without hand recoding (cutting and pasting in a spreadsheet)? Thank you for your help. Glad it works as is.

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 7:12 PM, John Fox j...@mcmaster.ca wrote:

Dear Sergios, Why don't you try what I suggested originally?
Adapted to this data set:

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(Day=factor(1:3))
    summary(Anova(mod, idata=idata, idesign=~Day))

Peter Dalgaard also pointed toward an article that describes how to do the same thing with anova().

Regards,
John

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sergios (Sergey) Charntikov
Sent: November-09-09 7:13 PM
To: Mike Lawrence
Cc: r-help@r-project.org
Subject: Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)

Hi Mike, I tried to run my data in SPSS and it works fine without any problems: plug in my levels, plug in my covariate (since it is all within) and get my Mauchly tests. I tried to rearrange the data so it looks like this (subj/treatment/day1/day2/day3):

    subject treatment day1 day2 day3
    1       1         8    8    8
    1       2         5    7    5
    2       1         7    4    4
    2       2         4    5    7
    3       1         8    6    4
    3       2         5    2    4
    4       1         2    9    4
    4       2         1    9    1
    5       1         4    8    1
    5       2         7    8    2
    6       1         4    7    2
    6       2         4    5    2

When I try

    mlmfit <- lm(Dataset ~ 1)

I get: invalid type (list) for variable 'Dataset'. When I try

    mod <- lm(cbind(day1, day2, day3) ~ Treatment, data=Dataset)
    idata <- data.frame(factor(rep(c(Dataset$day1, Dataset$day2, Dataset$day3))), ordered(Dataset$Treatment))
    Anova(mod, idata=idata, idesign=~Dataset$Treatment)

I get: Terms in the intra-subject model matrix are not orthogonal. When I try is.matrix(Dataset), I get no. My original mock Dataset (attached in txt) is below. Maybe I am not coding it right? I would hate to recode all my data for SPSS, since at the end I would need to show that sphericity was not violated.

    Subj Trtmt Sessn Response
    1    N     1     5
    1    D     1     6
    1    N     2     4
    1    D     2     7
    2    N     1     8
    2    D     1     9
    2    N     2     2
    2    D     2     1
    3    N     1     4
    3    D     1     5
    3    N     2     6
    3    D     2     2
    4    N     1     5
    4    D     1     6
    4    N     2     4
    4    D     2     7
    5    N     1     8
    5    D     1     9
    5    N     2     2
    5    D     2     1
    6    N     1     4
    6    D     1     5
    6    N     2     6
    6    D     2     2

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA

On Mon, Nov 9, 2009 at 5:29 PM, Mike Lawrence mike.lawre...@dal.ca wrote:

No luck as in...? What error did you encounter? In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily
[R] Quickly generate all possible combinations of 2 groups
Hi all, I suspect the answer to this query will be the tongue-in-cheek "use a quantum computer", but I thought my understanding might be sufficiently limited that I'm missing a simpler option.

I'm looking for a way to cycle through all possible combinations of 2 groups of data. For 10 data points, that's 2^10 combinations; for 20 data points it's 2^20, etc. combn() from the combinat package is useful for this purpose, as in:

    library(combinat)
    n = 20 #number of data points in the data set
    start = proc.time()[3] #start a timer
    pb = txtProgressBar(max=n+1, style=3) #initialize a progress bar
    for(i in 0:n){ #for each possible size of set1
        Set1 = combn(1:n, i) #get set1 combinations
        Set2 = rev(combn(1:n, n-i)) #get set2 combinations
        setTxtProgressBar(pb, i+1) #increment the progress bar
    }
    close(pb) #close the progress bar
    proc.time()[3] - start #show the time taken (about 40s @ 2.2GHz)

However, this obviously ends up being too slow when the number of data points rises much above 20 (I'll likely be dealing with data sets up to a maximum of 200 points).

In case it's relevant, the motivation behind this problem is that I'm seeking an alternative to EM or simplex methods for obtaining the MLE of mixture data. Given a mixture model consisting of 2 distributions, I should be able to obtain an MLE by finding the partitioning of the data into 2 groups that yields the highest likelihood. I'm specifically looking at modelling circular data by a mixture of a uniform and a von Mises centered on zero, so once I have a given partition, I can estimate the parameters of the model (proportion of points drawn from the von Mises and concentration of the von Mises) analytically and compute the likelihood of the data given that pair of parameters.
I've coded a variant of this approach that generates random partitionings of the data, and this seems to do a decent job of generating something that might be useful as a starting point for a subsequent EM or simplex search, but I thought I might double check with the list to see if there's a computationally efficient solution to the test-all-combinations scheme.

Cheers, Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
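A compact alternative to the nested combn() calls, sketched by me (not from the post): enumerate partitions as bitmasks, where each integer from 0 to 2^n - 1 encodes one assignment of the n points to the two groups. This does not change the asymptotics (intToBits caps n at 31, and exhaustive search stays hopeless anywhere near n = 200), but it avoids materializing combination matrices.

```r
# Enumerate all 2-group partitions of n points via bitmasks.
n <- 4
x <- rnorm(n)
sizes <- integer(2^n)
for (mask in 0:(2^n - 1)) {
  in1 <- as.logical(intToBits(mask))[1:n]  # bit i set -> point i in group 1
  set1 <- x[in1]
  set2 <- x[!in1]
  sizes[mask + 1] <- length(set1)
  # ... evaluate the likelihood of this partition here ...
}
table(sizes)  # group-1 sizes tally with choose(4, 0:4): 1 4 6 4 1
```

Since each partition and its complement give the same two-group split, iterating masks only up to 2^(n-1) - 1 halves the work when the two components' roles are decided by the likelihood rather than the labels.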
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Have you tried ezANOVA from the ez package? It attempts to provide a simple user interface to car's Anova (and when that fails, aov).

On Mon, Nov 9, 2009 at 1:44 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote:

Hello everyone, I am trying to do a within-subjects repeated measures anova followed by the test of sphericity (sample dataset below). I am able to get either a mixed model or linear model anova and TukeyHSD, but have no luck with "Repeated-Measures Assuming Sphericity" or "Separate Sphericity Tests". I am trying to follow the example from the car package, but it seems that I am not getting something right.

    Dataset$Sessn <- as.factor(Dataset$Sessn)
    LinearModel.1 <- lm(Response ~ Sessn*Trtmt, data=Dataset)
    summary(LinearModel.1)

All good so far, but I have problems understanding the idata= and idesign= arguments pertaining to my example. Session is my repeated measure (Sessn 1 and Sessn 2 = two sessions; in reality I have more) and it is already stacked. Any help or guidance on this matter would be appreciated. Thank you, my mock dataset is below. Each subject has two levels of treatment throughout four calendar days, which are recoded to Session 1 and Session 2 in order to compare treatments by the first and subsequent days of exposure (Treatment x Session; my DV is Response; Session is repeated).

    Subj Trtmt Sessn Response
    1    N     1     5
    1    D     1     6
    1    N     2     4
    1    D     2     7
    2    N     1     8
    2    D     1     9
    2    N     2     2
    2    D     2     1
    3    N     1     4
    3    D     1     5
    3    N     2     6
    3    D     2     2
    4    N     1     5
    4    D     1     6
    4    N     2     4
    4    D     2     7
    5    N     1     8
    5    D     1     9
    5    N     2     2
    5    D     2     1
    6    N     1     4
    6    D     1     5
    6    N     2     6
    6    D     2     2

Sincerely,
Sergios Charntikov (Sergey), MA
Behavioral Neuropharmacology Lab
Department of Psychology
University of Nebraska-Lincoln
Lincoln, NE 68588-0308 USA
sergioschr-at-gmail-dot-com
--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
No luck as in...? What error did you encounter? In your example data set, you only have 2 levels of each within-Ss factor, in which case you shouldn't expect to obtain tests of sphericity; as far as I understand it, sphericity necessarily holds for repeated measures with only 2 levels, and tests are really only possible for repeated measures with 3 or more levels. I think it's analogous to how you don't need to test homogeneity of variance when performing a paired t-test; the test ends up representing the pairs as a single distribution of difference scores with a single variance. Mike On Mon, Nov 9, 2009 at 5:30 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote: Tried ezANOVA, no luck with my particular dataset. Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA On Mon, Nov 9, 2009 at 2:25 PM, Mike Lawrence mike.lawre...@dal.ca wrote: Have you tried ezANOVA from the ez package? It attempts to provide a simple user interface to car's Anova (and when that fails, aov). On Mon, Nov 9, 2009 at 1:44 PM, Sergios (Sergey) Charntikov sergios...@gmail.com wrote: Hello everyone, I am trying to do a within-subjects repeated measures anova followed by the test of sphericity (sample dataset below). I am able to get either a mixed model or linear model anova and TukeyHSD, but have no luck with Repeated-Measures Assuming Sphericity or Separate Sphericity Tests. I am trying to follow the example from the car package, but it seems that I am not getting something right. Dataset$Sessn <- as.factor(Dataset$Sessn) LinearModel.1 <- lm(Response ~ Sessn*Trtmt, data=Dataset) summary(LinearModel.1) All good so far, but I have trouble understanding the idata= and idesign= arguments pertaining to my example. Session is my repeated measure (Sessn 1 and Sessn 2 = two sessions; in reality I have more) and it is already stacked. Any help or guidance on this matter would be appreciated. 
Thank you, my mock dataset is below. Each subject has two levels of treatment throughout four calendar days which are recoded to Session 1 and Session 2 in order to compare treatments by the first and subsequent days of exposure (Treatment x Session; my DV is Response; Session is repeated). Subj Trtmt Sessn Response 1 N 1 5 1 D 1 6 1 N 2 4 1 D 2 7 2 N 1 8 2 D 1 9 2 N 2 2 2 D 2 1 3 N 1 4 3 D 1 5 3 N 2 6 3 D 2 2 4 N 1 5 4 D 1 6 4 N 2 4 4 D 2 7 5 N 1 8 5 D 1 9 5 N 2 2 5 D 2 1 6 N 1 4 6 D 1 5 6 N 2 6 6 D 2 2 Sincerely, Sergios Charntikov (Sergey), MA Behavioral Neuropharmacology Lab Department of Psychology University of Nebraska-Lincoln Lincoln, NE 68588-0308 USA sergioschr-at-gmail-dot-com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
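The analogy to the paired t-test above can be checked directly in base R: for a two-level within-Ss factor, the F statistic from aov() with an Error() term equals the squared paired-t statistic, which is why sphericity is moot with only 2 levels. A minimal sketch (the variable names and simulated data are mine, not from the thread):

```r
# Demo: with only 2 levels of a within-Ss factor, the repeated-measures
# ANOVA F equals the squared paired-t statistic.
set.seed(1)
subj <- factor(rep(1:10, times = 2))
sess <- factor(rep(1:2, each = 10))
resp <- rnorm(20) + rep(rnorm(10), times = 2)  # add per-subject offsets
fit <- summary(aov(resp ~ sess + Error(subj/sess)))
tab <- fit[["Error: subj:sess"]][[1]]          # the within-Ss stratum table
F_val <- tab[["F value"]][1]
t_val <- t.test(resp[sess == "1"], resp[sess == "2"], paired = TRUE)$statistic
all.equal(F_val, unname(t_val)^2)  # TRUE
```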
[R] EM algorithm to fit circular mix of uniform+Von Mises
Hi all, I'm curious if anyone has coded an Expectation-Maximization algorithm that could help me model some circular data I have. I'd like to model it as a mixture of a uniform and a von Mises centered on 0, so the only free parameters are the mixing proportion and the kappa of the von Mises. I couldn't find anything in the contributed packages that seemed to suit this purpose. Any pointers would be greatly appreciated! Cheers, Mike -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
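Since nothing turned up in the contributed packages, here is a minimal EM sketch one could start from (my own code, not from any package): the von Mises mean is fixed at 0, the E-step computes each angle's responsibility under the von Mises component, and the M-step updates the mixing proportion and kappa, the latter via Fisher's approximation to the inverse of A(kappa) = I1(kappa)/I0(kappa).

```r
# EM for a mixture of circular uniform and von Mises(mu = 0, kappa),
# for angles x in (-pi, pi]. Free parameters: p (von Mises weight), kappa.
dvm0 <- function(x, kappa) exp(kappa * cos(x)) / (2 * pi * besselI(kappa, 0))
em_unif_vm <- function(x, p = 0.5, kappa = 1, tol = 1e-6, maxit = 1000) {
  for (i in seq_len(maxit)) {
    # E-step: responsibility of the von Mises component for each angle
    num <- p * dvm0(x, kappa)
    w <- num / (num + (1 - p) / (2 * pi))
    # M-step: update mixing proportion and concentration
    p_new <- mean(w)
    Rbar <- sum(w * cos(x)) / sum(w)     # weighted mean resultant length
    Rbar <- min(max(Rbar, 1e-6), 0.999)  # keep the approximation stable
    kappa_new <- Rbar * (2 - Rbar^2) / (1 - Rbar^2)  # Fisher's approximation
    done <- abs(p_new - p) < tol && abs(kappa_new - kappa) < tol
    p <- p_new; kappa <- kappa_new
    if (done) break
  }
  list(p = p, kappa = kappa)
}
```

With data simulated as half circular-uniform and half tightly concentrated around 0, the recovered mixing proportion should land near 0.5 and kappa well above 1.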
Re: [R] Creating a specific skewed distribution
With skewed unimodal distributions, the mode can't equal the mean, so assuming you want the mean to be around 30, I find that a Weibull distribution can get close to what you want: mean(rweibull(1e5,1.5,33)) [1] 29.77781 pweibull(60,1.5,33) [1] 0.9138475 I'm sure you can play with the parameters to get even closer to what you want. On Wed, Jun 10, 2009 at 2:48 AM, David Arnold dwarnol...@suddenlink.net wrote: All, Can someone help me create a skewed distribution, mean = 30, with the probability of selecting a random number from the distribution greater than or equal to 60 equal to 10%? I need the probability density function to equal zero at zero, and have a maximum height at or near 30. Is this possible? And if it is possible, how can I adjust the distribution so that the probability of selecting a random number greater than or equal to 60 is p? Thanks. No idea how to start. David [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
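Rather than hand-tuning the parameters, the two constraints can be solved directly: the Weibull mean is scale * gamma(1 + 1/shape) and the upper tail is P(X > q) = exp(-(q/scale)^shape). Fixing the tail probability expresses scale as a function of shape, leaving one equation for uniroot(). A sketch (the bracketing interval c(0.5, 5) is my assumption):

```r
# Solve for Weibull (shape, scale) with mean 30 and P(X >= 60) = 0.10.
# From P(X > q) = exp(-(q/scale)^shape) = p_tail:
#   scale = q / log(1/p_tail)^(1/shape)
# Substitute into mean = scale * gamma(1 + 1/shape) and root-find on shape.
target_mean <- 30; q <- 60; p_tail <- 0.10
scale_for <- function(shape) q / log(1 / p_tail)^(1 / shape)
gap <- function(shape) scale_for(shape) * gamma(1 + 1 / shape) - target_mean
shape <- uniroot(gap, c(0.5, 5))$root
scale <- scale_for(shape)
c(shape = shape, scale = scale)
1 - pweibull(q, shape, scale)   # tail probability: 0.10 by construction
scale * gamma(1 + 1 / shape)    # mean: ~30
```

The shape comes out a little above 1, so the density is zero at zero as requested, and changing p_tail generalizes this to an arbitrary tail probability p.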
Re: [R] graphically representing frequency of words in a speech?
Below are various attempts using ggplot2 (http://had.co.nz/ggplot2/). First I try random positioning, then random positioning with alpha, then a quasi-random position scheme in polar coordinates: #this demo has random number generation # so best to set a seed to make it # reproducible. set.seed(1) #generate some fake data a = data.frame( word = month.name , freq = sample(1:10,12,replace=TRUE) ) #add arbitrary location information a$x = sample(1:12,12) a$y = sample(1:12,12) #load ggplot2 library(ggplot2) #initialize a ggplot object my_plot = ggplot() #create an object for the text layer my_text = geom_text( data = a , aes( x = x , y = y , label = word , size = freq ) ) #create an object for the text size limits my_size_scale = scale_size( to = c(3,20) ) #create an object to expand the x-axis limits # (ensures that text isn't cropped) my_x_scale = scale_x_continuous( expand = c(.5, 0) ) #ditto for the y axis my_y_scale = scale_y_continuous( expand = c(.5, 0) ) #create an opts object that removes # plot elements unnecessary in a tag cloud my_opts = opts( legend.position = 'none' , panel.grid.minor = theme_blank() , panel.grid.major = theme_blank() , panel.background = theme_blank() , axis.line = theme_blank() , axis.text.x = theme_blank() , axis.text.y = theme_blank() , axis.ticks = theme_blank() , axis.title.x = theme_blank() , axis.title.y = theme_blank() ) #show the plot print( my_plot+ my_text+ my_size_scale+ my_x_scale+ my_y_scale+ my_opts ) #to aid readability amidst overlap, set alpha in # the call to geom_text my_text_with_alpha = geom_text( data = a , aes( x = x , y = y , label = word , size = freq ) , alpha = .5 ) #show the version with alpha print( my_plot+ my_text_with_alpha+ my_size_scale+ my_x_scale+ my_y_scale+ my_opts ) #alternatively, in polar coordinates, # which maps x to angle and y to radius, # making a nice circle print( my_plot+ my_text_with_alpha+ my_size_scale+ my_opts+ coord_polar() ) #(note omission of my_y_scale and # my_x_scale, which seem to be ignored # when coord_polar() is called. I'll # report this possible bug to the ggplot2 # maintainer) #a possible way to avoid overlap is to # map radius (y) to frequency so that # larger text is in the periphery # where there is more room. This # necessitates adding some random # noise to the frequency so that # the low frequency words don't # jumble in the center too badly a$freq2 = a$freq+rnorm(12) #now map radius (y) to freq2 my_text_with_alpha_and_freq2 = geom_text( data = a , aes( x = x , y = freq2 , label = word , size = freq ) , alpha = .5 ) #show the version with alpha and radius mapped to freq2 print( my_plot+ my_text_with_alpha_and_freq2+ my_size_scale+ my_opts+ coord_polar() ) -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting a Weibull Distribution
http://lmgtfy.com/?q=fit+weibull+distribution+R Hit #8: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf On Fri, Jun 5, 2009 at 7:21 PM, r...@rstatx.com wrote: How do you fit a Weibull distribution in R? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Anyone using Bespin R?
Just curious whether anyone has any experiences to relate with regards to using Bespin (https://bespin.mozilla.com/) as a text editor for your R scripting. Mike -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Within Subject ANOVA question
Although you factorized subject and condition when you created them as separate objects, this didn't survive cbind() then as.data.frame(), thus: example <- data.frame(cbind(subject,condition,recall)) str(example) 'data.frame': 30 obs. of 3 variables: $ subject : int 1 2 3 4 5 6 7 8 9 10 ... $ condition: int 1 1 1 1 1 1 1 1 1 1 ... $ recall : int 6 3 7 16 12 11 1 8 5 4 ... Use this alternative construction (note that I omit factorization of the response variable, recall, because this shouldn't be a factor): subject <- factor(c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)) condition <- factor(c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5)) recall <- c(10,6,11,22,16,15,1,12,9,8,13,8,14,23,18,17,1,15,12,9,13,8,14,25,20,17,4,17,12,12) example <- data.frame(subject,condition,recall) str(example) 'data.frame': 30 obs. of 3 variables: $ subject : Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ... $ condition: Factor w/ 3 levels "1","2","5": 1 1 1 1 1 1 1 1 1 1 ... $ recall : num 10 6 11 22 16 15 1 12 9 8 ... aov.recall <- aov(recall~condition + Error(subject/condition),data=example) summary(aov.recall) Error: subject Df Sum Sq Mean Sq F value Pr(>F) Residuals 9 942.53 104.73 Error: subject:condition Df Sum Sq Mean Sq F value Pr(>F) condition 2 52.267 26.133 42.506 1.519e-07 *** Residuals 18 11.067 0.615 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 On Mon, Jun 1, 2009 at 7:31 PM, tsunhin wong thjw...@gmail.com wrote: Dear R users, I have copied the following table from an article on "Using confidence intervals in within-subject designs": Subject 1sec 2sec 5sec Mean 1 10 13 13 12.00 2 6 8 8 7.33 3 11 14 14 13.00 4 22 23 25 23.33 5 16 18 20 18.00 6 15 17 17 16.33 7 1 1 4 2.00 8 12 15 17 14.67 9 9 12 12 11.00 10 8 9 12 9.67 I rearranged the data this way: subject <- factor(c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)) condition <- factor(c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,5,5,5,5,5,5,5,5,5,5)) recall <- factor(c(10,6,11,22,16,15,1,12,9,8,13,8,14,23,18,17,1,15,12,9,13,8,14,25,20,17,4,17,12,12)) example <- cbind(subject,condition,recall) Using ANOVA (aov), I should have DF for condition = 2, DF of subjects = 9, interaction DF = 18, and a term for the mean square of the interaction (0.61). But I have something like below instead: aov.recall <- aov(recall~condition + Error(subject/condition),data=as.data.frame(example)) summary(aov.recall) Error: subject Df Sum Sq Mean Sq F value Pr(>F) Residuals 1 12.898 12.898 Error: subject:condition Df Sum Sq Mean Sq condition 1 34.505 34.505 Error: Within Df Sum Sq Mean Sq F value Pr(>F) condition 1 3.22 3.22 0.1378 0.7135 Residuals 26 607.54 23.37 The within-subject (repeated measures) anova of R seems to be a bit subtle for me; please point out the errors that I have made if you can. Thank you very much! - John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. 
~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2 legend
, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 15L, 16L, 17L, 16L, 17L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 4L, 5L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 5L, 6L, 7L, 8L, 8L, 9L, 8L, 8L, 9L, 9L, 8L, 9L, 10L, 11L, 10L, 10L, 11L, 12L, 11L, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 4L, 5L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 5L, 6L, 7L, 8L, 8L, 9L, 8L, 8L, 9L, 9L, 8L, 9L, 10L, 11L, 10L, 10L, 11L, 12L, 11L, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 15L, 16L, 17L, 16L, 17L, 16L, 17L, 18L, 19L, 17L, 18L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 3L, 4L, 5L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 5L, 6L, 7L, 8L, 8L, 9L, 8L, 8L, 9L, 9L, 8L, 9L, 10L, 11L, 10L, 10L, 11L, 12L, 11L, 12L, 12L, 13L, 14L, 12L, 13L, 14L, 15L, 16L, 15L, 14L, 15L, 16L, 16L, 15L, 16L, 17L, 16L, 17L, 16L, 17L, 18L, 19L, 17L, 18L)), .Names = c(SampleDate, PondName, Muestreo, BodyWeight.g., Length.mm.), class = data.frame, row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 
215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428)) library(ggplot2) fishplot - qplot(PondName,BodyWeight.g.,data=fish_ByMuestreo,colour=Muestreo,position=jitter) + stat_summary(aes(group=Muestreo),fun.data=mean_cl_normal,colour=green,geom=smooth,fill=NA) + opts(title=Average weight(grs) by Pond) print(fishplot) Felipe D. Carrillo Supervisory Fishery Biologist Department of the Interior US Fish Wildlife Service California, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. 
~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple ANOVA tests
#create some data y=rnorm(20) x=factor(rep(c('A','B'),each=10)) #run the anova my_aov = aov(y~x) #summarize the anova my_aov_summary = summary(my_aov) #show the anova summary print(my_aov_summary) #let's see what's in the summary object str(my_aov_summary) #looks like it's a list with 1 element which #in turn is a data frame with columns. #The Pr(>F) column looks like what we want my_aov_summary[[1]]$P #yup, that's it. Grab the first value p = my_aov_summary[[1]]$P[1] On Wed, May 27, 2009 at 7:11 AM, Imri bisr...@agri.huji.ac.il wrote: Hi all - I'm trying to do multiple one-way ANOVA tests of different factors on the same variable. As a result I have a list with all the ANOVA tables, for example: $X11_20502 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 369.9 123.3 6.475 0.0002547 *** Residuals 635 12093.2 19.0 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 $X11_21067 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 1 26.7 26.7 1.3662 0.2429 Residuals 637 12436.4 19.5 $X11_10419 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 527.8 175.9 9.361 4.621e-06 *** Residuals 635 11935.3 18.8 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 My question is: how can I extract from this list just the Pr(>F) values for each x? -- View this message in context: http://www.nabble.com/Multiple-ANOVA-tests-tp23739615p23739615.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. 
~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple ANOVA tests
you could use ldply from the plyr package: p = ldply(q,function(x){x$P}) Without your data I can't confirm that works, but something like that should do it. On Wed, May 27, 2009 at 9:23 AM, Imri bisr...@agri.huji.ac.il wrote: Thanks for the answer!!! I know how to extract the Pr(>F) value from a single ANOVA table, but I have a list of many ANOVA tables received by: a <- function(x)(aov(MPH~x)) q <- apply(assoc[,18:20],2,a) # just for example, I have more than 3 factors (x) print(q) $X11_20502 Df Sum Sq Mean Sq F value Pr(>F) x 3 369.9 123.3 6.475 0.0002547 *** Residuals 635 12093.2 19.0 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 246 observations deleted due to missingness $X11_21067 Df Sum Sq Mean Sq F value Pr(>F) x 1 26.7 26.7 1.3662 0.2429 Residuals 637 12436.4 19.5 246 observations deleted due to missingness $X11_10419 Df Sum Sq Mean Sq F value Pr(>F) x 3 527.8 175.9 9.361 4.621e-06 *** Residuals 635 11935.3 18.8 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 246 observations deleted due to missingness summary(q) Length Class Mode X11_20502 1 summary.aov list X11_21067 1 summary.aov list X11_10419 1 summary.aov list How can I extract all the Pr(>F) values from q (not one by one)? Thanks Imri Mike Lawrence wrote: #create some data y=rnorm(20) x=factor(rep(c('A','B'),each=10)) #run the anova my_aov = aov(y~x) #summarize the anova my_aov_summary = summary(my_aov) #show the anova summary print(my_aov_summary) #let's see what's in the summary object str(my_aov_summary) #looks like it's a list with 1 element which #in turn is a data frame with columns. #The Pr(>F) column looks like what we want my_aov_summary[[1]]$P #yup, that's it. Grab the first value p = my_aov_summary[[1]]$P[1] On Wed, May 27, 2009 at 7:11 AM, Imri bisr...@agri.huji.ac.il wrote: Hi all - I'm trying to do multiple one-way ANOVA tests of different factors on the same variable. As a result I have a list with all the ANOVA tables, for example: $X11_20502 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 369.9 123.3 6.475 0.0002547 *** Residuals 635 12093.2 19.0 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 $X11_21067 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 1 26.7 26.7 1.3662 0.2429 Residuals 637 12436.4 19.5 $X11_10419 Analysis of Variance Table Response: MPH Df Sum Sq Mean Sq F value Pr(>F) x 3 527.8 175.9 9.361 4.621e-06 *** Residuals 635 11935.3 18.8 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 My question is: how can I extract from this list just the Pr(>F) values for each x? -- View this message in context: http://www.nabble.com/Multiple-ANOVA-tests-tp23739615p23739615.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Multiple-ANOVA-tests-tp23739615p23741437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
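For completeness, the same extraction can be done in base R without plyr by sapply()-ing over the list of summaries. A sketch with simulated data standing in for Imri's assoc object (the factor names here are mine):

```r
# Extract the first Pr(>F) from each of several one-way ANOVA summaries.
set.seed(1)
MPH <- rnorm(60)
factors <- data.frame(f1 = gl(3, 20), f2 = gl(2, 30), f3 = gl(4, 15))
q <- lapply(factors, function(x) summary(aov(MPH ~ x)))
p_values <- sapply(q, function(s) s[[1]][["Pr(>F)"]][1])
p_values  # named vector, one p-value per factor
```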
Re: [R] Labeling barplot bars by multiple factors
You can get something close with ggplot2: library(ggplot2) my_data = expand.grid( A = factor(c('A1','A2')) , B = factor(c('B1','B2')) , C = factor(c('C1','C2')) ) my_data$DV = rnorm(8,mean=10,sd=1) p = ggplot() p = p + layer( geom = 'bar' , stat = 'identity' , data = my_data , mapping = aes( x = C , y = DV , fill = B ) , position = 'dodge' ) p = p + facet_grid( A ~ . ) p = p + coord_flip() print(p) On Wed, May 27, 2009 at 1:01 PM, Thomas Levine thomas.lev...@gmail.com wrote: I want to plot quantitative data as a function of three two-level factors. How do I group the bars on a barplot by level through labeling and spacing? Here's what I'm thinking of: http://www.thomaslevine.org/sample_multiple-factor_barplot.png Also, I'm pretty sure that I want a barplot, but there may be something better. Tom [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help wth boxplots
check out ggplot2: http://had.co.nz/ggplot2/ particularly: http://had.co.nz/ggplot2/geom_boxplot.html On Tue, May 26, 2009 at 10:34 AM, Amit Patel amitrh...@yahoo.co.uk wrote: Hi, I have a vector of data, let's call it zz (40 values from 4 samples). The data is already in groups; I can even split up the samples using SampA <- zz[,2:11] SampB <- zz[,12:21] SampC <- zz[,22:31] SampV <- zz[,32:41] I would like an output that gives me 4 boxplots on one plot, one boxplot for each set of 10 values. How can I do this in R? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
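The base-R route works here too, for comparison: stack the four column blocks into one long vector with a grouping variable and hand the formula to boxplot(). A sketch assuming zz is a data frame laid out as described (the simulated zz and group labels are mine):

```r
# Four boxplots on one plot, one per sample of 10 columns.
set.seed(1)
zz <- as.data.frame(matrix(rnorm(41 * 5), ncol = 41))  # stand-in data
vals <- c(unlist(zz[, 2:11]), unlist(zz[, 12:21]),
          unlist(zz[, 22:31]), unlist(zz[, 32:41]))
grp <- rep(c("A", "B", "C", "V"), each = 10 * nrow(zz))
boxplot(vals ~ grp, ylab = "value")
```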
Re: [R] Help me...!!!
http://www.r-project.org/posting-guide.html Basic statistics and classroom homework: R-help is not intended for these. On Tue, May 26, 2009 at 11:37 AM, abel1682 lizard_1...@yahoo.it wrote: Hi to all... I'm a new R user and I have to solve some exercises, so I'm asking you for help... 1.) How can I demonstrate in R that the limit for x --> infinity of (1+1/x)^x is equal to e? 2.) If I have a vector of values, how can I create a function that, applied to my vector, gives me the median, mean, Var and length together? 3.) Find the minimum of this function: f(x)=(x-3)^4 with the Newton method. 4.) Define a function that is able to calculate the geometric mean of a series. Sorry for all these questions... Thanks a lot!!!... -- View this message in context: http://www.nabble.com/Help-me...%21%21%21-tp23724167p23724167.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to create all pairs
expand.grid(i,j) On Mon, May 25, 2009 at 8:26 PM, alad abhimanyu...@gmail.com wrote: Hi, I have: i = c(1,2,3) j = c(4,5,6) How do I create a matrix of all pairs? i.e. 1,4 1,5 1,6 2,4 : Thanks! -- View this message in context: http://www.nabble.com/How-to-create-all-pairs-tp23714659p23714659.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
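One caveat worth noting about that one-liner (the ordering concern is my addition): expand.grid() varies its first argument fastest, so expand.grid(i, j) yields the pairs as 1,4 2,4 3,4 ... To reproduce the 1,4 1,5 1,6 ordering shown in the question, put j first and reorder the columns:

```r
i <- c(1, 2, 3)
j <- c(4, 5, 6)
# expand.grid(j, i) varies j fastest; reorder columns back to (i, j)
m <- as.matrix(expand.grid(j = j, i = i)[, c("i", "j")])
m  # rows: 1,4  1,5  1,6  2,4  ...
```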
Re: [R] help with replacing factors
This should work: levels(black_gray$gray)[levels(black_gray$gray)=='gray20'] = 'blue' On Sun, May 24, 2009 at 8:15 AM, Andreas Christoffersen achristoffer...@gmail.com wrote: Hi, In the example dataset below - how can I change "gray20" to "blue"? # data black <- rep(c("black","red"),10) gray <- rep(c("gray10","gray20"),10) black_gray <- data.frame(black,gray) # none of these desperate things works # replace(black_gray$gray, gray=="gray20","red") # if(black_gray$gray=="gray20"){black_gray$gray <- "blue"} # for (i in black_gray$gray) if(black_gray$gray[i]=="gray20"){black_gray$gray[i] <- "blue"} # black_gray$gray=="gray14" <- "blue" # black_gray$gray[gray=="gray20"] <- "blue" # subset(black_gray,gray=="gray20",gray) <- rep("blue",10) I have a feeling this is me misunderstanding some very basic stuff about the R engine... So any help will be much appreciated. Thanks in advance Andreas [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
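A quick self-contained check of that levels()<- recipe (one note: on R >= 4.0, data.frame() no longer converts strings to factors by default, so the factor() calls below are explicit):

```r
black_gray <- data.frame(
  black = factor(rep(c("black", "red"), 10)),
  gray  = factor(rep(c("gray10", "gray20"), 10))
)
# recode the level itself; every row carrying "gray20" becomes "blue"
levels(black_gray$gray)[levels(black_gray$gray) == "gray20"] <- "blue"
table(black_gray$gray)
```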
Re: [R] Coord_equal in ggplot2
If you use coord_equal on data where the range of the x-axis is larger than the range of the y-axis, then of course you'll observe extra space on the y-axis. What did you expect? Also, this post may be better suited to the ggplot2 mailing list: http://had.co.nz/ggplot2/

On Tue, May 19, 2009 at 7:17 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:
Dear all, I'm plotting some points on a graph where both axes need to have the same scale. See the example below. coord_equal does the trick but in this case it wastes a lot of space on the y-axis. Setting the limits of the y-axis myself was of no avail. Any suggestions to solve this problem?
library(ggplot2)
ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000, min = 14, max = 26))
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() + scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits = c(14, 26))
Regards, Thierry
ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
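A base-graphics analogue of the point being made (this sketch is an addition, not from the thread): asp = 1 forces the same unit length on both axes, just as coord_equal() does, so an x-range of about 30 plotted against a y-range of about 12 necessarily leaves vertical padding.

```r
# Same toy data as the question; with a 1:1 aspect ratio the wider
# x-range dictates the plot shape, producing the "wasted" y space.
ds <- data.frame(x = runif(1000, min = 0, max = 30),
                 y = runif(1000, min = 14, max = 26))
plot(ds$x, ds$y, asp = 1)
```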
Re: [R] MAC OSX vs Win XP: Different stats test results!
If mulrank does any sort of random number generation or non-exhaustive randomization, you should set the seed of the random number generator first:

set.seed(1)
mulrank(3, 6, data03$x)

On Mon, May 18, 2009 at 7:37 AM, Mareen mareenwe...@yahoo.com wrote:
Hi all, I wondered whether anyone has some advice on a stats-related 'sanity check', as I ran a nonparametric multivariate test (mulrank function as described by R. Wilcox, 2005) on both systems, but got different results (please see below for the system-specific outputs)! The functions I used are attached as well. Any advice would be much appreciated! Thanks in advance for getting back to me! Best wishes, Mareen

Mac:
data03 <- selby2(data02, c(1,2), 3)
mulrank(3,6,data03$x)
$test.stat [1] 0.9331133
$nu1 [1] 11.46300
$p.value [,1] [1,] 0.509296
$N [1] 233
$q.hat [,1] [,2] [,3] [,4] [,5] [,6] [1,] 0.4940071 0.5256726 0.5176384 0.5476290 0.4690935 0.5265100 [2,] 0.5170627 0.4791950 0.5026431 0.4867843 0.4778865 0.5033497 [3,] 0.4680729 0.4944258 0.4889563 0.4505391 0.5311420 0.4726002

Win:
mulrank(3,6, data03$x)
$test.stat [1] 1.114665
$nu1 [1] 8.155991
$p.value [,1] [1,] 0.3491221
$N [1] 233
$q.hat [,1] [,2] [,3] [,4] [,5] [,6] [1,] 0.4940071 0.5406417 0.5236837 0.5656338 0.4771799 0.5324505 [2,] 0.5162776 0.4801895 0.5022244 0.4960745 0.4854234 0.4820737 [3,] 0.5013608 0.4920967 0.4810269 0.4482885 0.5326861 0.4871506

http://www.nabble.com/file/p23595008/Rallfun-v92.txt Rallfun-v92.txt
-- View this message in context: http://www.nabble.com/MAC-OSX-vs-Win-XP%3A-Different-stats-test-results%21-tp23595008p23595008.html Sent from the R help mailing list archive at Nabble.com.
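An illustration of the seeding point (toy code, not the mulrank function itself): two runs of a randomized computation agree only when the RNG is seeded identically, which is the first thing to check when two machines give different "random" answers.

```r
# Identical seeds give identical streams of random numbers.
set.seed(1); a <- rnorm(5)
set.seed(1); b <- rnorm(5)
# Without reseeding, a third draw would differ from both.
c_draw <- rnorm(5)
```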
Re: [R] sample variance from simulation
why not simply

vars = list()
for (i in 1:1000) vars[[i]] = var(z[[i]])

On Mon, May 18, 2009 at 6:51 AM, Kon Knafelman konk2...@hotmail.com wrote:
Hi,
g=list()
for(i in 1:1000){z[[i]]=rnorm(15,0,1)}
I've attempted a similar problem based on the above method. Now, if i want to find the sample variance, do i go about it like this?
for (i in 1:1000) vars[[i]] = sum(z[[i]])
vars[[i]]
the overall sigma squared will just be 1, because the distribution is standard normal. Is this correct? if so, then to find (n-1)S^2/σ^2, i will need s = 999*sum(vars[[i]])/1? Is this correct, or am i getting lost along the way? Thank you

Date: Wed, 13 May 2009 16:45:22 +0100 From: b.rowling...@lancaster.ac.uk To: csa...@rmki.kfki.hu CC: r-help@r-project.org Subject: Re: [R] Simulation
On Wed, May 13, 2009 at 4:26 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:
On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
Dear R users, Can anyone please tell me how to generate a large number of samples in R, given certain distribution and size. For example, if I want to generate 1000 samples of size n=100, with a N(0,1) distribution, how should I proceed? (Since I don't want to do rnorm(100,0,1) in R for 1000 times)
Why not? It took 0.05 seconds on my 5 years old laptop.
Second-guessing the user, I think she maybe doesn't want to type in 'rnorm(100,0,1)' 1000 times... Soln - for loop:
z=list()
for(i in 1:1000){z[[i]]=rnorm(100,0,1)}
now inspect the individual bits:
hist(z[[1]])
hist(z[[545]])
If that's the problem, then I suggest she reads an introduction to R... Barry
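A self-contained sketch of the suggestion (data simulated here so it runs standalone): collect the per-sample variance of each of the 1000 samples, via the loop suggested above or, equivalently, via sapply().

```r
# 1000 samples of size 15 from N(0,1).
set.seed(42)
z <- list()
for (i in 1:1000) z[[i]] <- rnorm(15, 0, 1)

# Loop version of the per-sample variances...
vars <- list()
for (i in 1:1000) vars[[i]] <- var(z[[i]])

# ...and the equivalent one-liner.
vars2 <- sapply(z, var)
```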
Re: [R] sample variance from simulation
Ah, I thought this smelled like homework... Please read the R-help mailing list posting guide (http://www.r-project.org/posting-guide.html), specifically: "Basic statistics and classroom homework: R-help is not intended for these."

On Mon, May 18, 2009 at 10:35 AM, Kon Knafelman konk2...@hotmail.com wrote:
Hey, when i type in either of those formulas into R, i don't really get the answer i'm looking for. For such large samples, isn't the sample variance meant to approach the actual variance, which is 1 for a standard normal? also, when i use sapply, i get 1000 results for variance, where i think i just need one number. I've worked on this problem for so long. The initial problem is as follows:
Use the simulation capacity of R to generate m = 1000 samples of size n = 15 from a N(0,1) distribution. Compute the statistic (n-1)S^2/σ^2 for the normally generated values, labelling as NC14. Produce a probability histogram for NC14 and superimpose the theoretical distribution for a χ2 (14 degrees of freedom).
g=list()
for(i in 1:1000){z[[i]]=rnorm(15,0,1)}
for (i in 1:1000) vars[[i]] = sum(z[[i]])
vars[[i]]
sum(var(z[[i]])) [1] 0.9983413
Does this make sense? my logic is that i use the loop again to add up all the individual variances. i'm not really sure if i did it correctly, but if someone could make the necessary corrections, i'd be very very grateful. Thanks heaps guys for taking the time to look at this

Date: Mon, 18 May 2009 15:06:47 +0200 From: waclaw.marcin.kusnierc...@idi.ntnu.no To: konk2...@hotmail.com CC: mike.lawre...@dal.ca; r-help@r-project.org Subject: Re: [R] sample variance from simulation
Mike Lawrence wrote:
why not simply vars=list() for (i in 1:1000) vars[[i]] = var(z[[i]])
... or, much simpler,
vars = sapply(z, var)
vQ
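For the record, a hedged sketch of the statistic the assignment describes (my own toy code, not a homework answer): with σ^2 = 1, the quantity (n-1)S^2/σ^2 computed per sample should follow a chi-squared distribution on n-1 = 14 degrees of freedom, which the histogram overlay makes visible.

```r
# 1000 samples of size 15; (n-1)*S^2 with sigma^2 = 1 is just 14*var.
set.seed(1)
z <- lapply(1:1000, function(i) rnorm(15))
nc14 <- 14 * sapply(z, var)
# Probability histogram with the theoretical chi-squared(14) curve.
hist(nc14, freq = FALSE, breaks = 30)
curve(dchisq(x, df = 14), add = TRUE)
```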
Re: [R] MAC OSX vs Win XP: Different stats test results!
A good example of why manipulating data in external programs is dangerous. See a discussion of this at http://tolstoy.newcastle.edu.au/R/e6/help/09/05/13697.html

On Mon, May 18, 2009 at 10:56 AM, Mareen Weber mareenwe...@yahoo.com wrote:
Hello Peter, thanks for your quick response - you have hinted in the right direction. I triple-checked my data and - how embarrassing - I actually did have different data matrices. Due to a faulty Win Excel macro that sorted data into groups after data collection, I ended up with different data in my 3 groups in the Win test... All good now :-). Thanks again! Best wishes, Mareen
Re: [R] read multiple large files into one dataframe
What types of data are in each file? All numbers, or a mix of numbers and characters? Any missing data or special NA values? On Wed, May 13, 2009 at 7:45 AM, SYKES, Jennifer jennifer.sy...@nats.co.uk wrote: Hello Apologies if this is a simple question, I have searched the help and have not managed to work out a solution. Does anybody know an efficient method for reading many text files of the same format into one table/dataframe? I have around 90 files that contain continuous data over 3 months but that are split into individual days data and I need the whole 3 months in one file for analysis. Each days file contains a large amount of data (approx 30MB each) and so I need a memory efficient method to merge all of the files into the one dataframe object. From what I have read I will probably want to avoid using for loops etc? All files are in the same directory, none have a header row, and each contain around 180,000 rows and the same 25 columns/variables. Any suggested packages/routines would be very useful. Thanks Jennifer - ***If you are not the intended recipient, please notify our Help Desk at Email postmas...@nats.co.uk immediately. You should not copy or use this email or attachment(s) for any purpose nor disclose their contents to any other person. NATS computer systems may be monitored and communications carried on them recorded, to secure the effective operation of the system and for other lawful purposes. Please note that neither NATS nor the sender accepts any responsibility for viruses or any losses caused as a result of viruses and it is your responsibility to scan or otherwise check this email and any attachments. NATS means NATS (En Route) plc (company number: 4129273), NATS (Services) Ltd (company number 4129270), NATSNAV Ltd (company number: 4164590) or NATS Ltd (company number 3155567) or NATS Holdings Ltd (company number 4138218). 
All companies are registered in England and their registered office is at 5th Floor, Brettenham House South, Lancaster Place, London, WC2E 7EN. **
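One common pattern for this task (a sketch on three tiny throw-away files, not the poster's 30MB data; file names and sizes here are invented): read every file in the directory with lapply(), then bind once with do.call(rbind, ...), which avoids repeatedly growing a data frame. Pre-declaring colClasses also spares read.table the cost of guessing column types.

```r
# Write three toy "daily" files into a temp directory.
dir <- tempdir()
for (d in 1:3)
  write.table(matrix(rnorm(10 * 4), 10, 4),
              file.path(dir, sprintf("day%02d.txt", d)),
              row.names = FALSE, col.names = FALSE)

# Read them all and bind into one data frame in a single step.
files <- list.files(dir, pattern = "^day", full.names = TRUE)
all_days <- do.call(rbind,
                    lapply(files, read.table, header = FALSE,
                           colClasses = rep("numeric", 4)))
```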
Re: [R] Simulation
If you want k samples of size n, why not generate all k*n values at once and put them in a k-by-n matrix, where you can do what you want to each sample:

k = 10
n = 100
x = matrix(rnorm(k*n), k, n)
rowMeans(x)

If you need to do more complex things to each sample, and if k is large enough that you don't want the matrix sitting around in memory while you do these things, you could also check out ?replicate .

On Wed, May 13, 2009 at 12:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
Dear R users, Can anyone please tell me how to generate a large number of samples in R, given certain distribution and size. For example, if I want to generate 1000 samples of size n=100, with a N(0,1) distribution, how should I proceed? (Since I don't want to do rnorm(100,0,1) in R for 1000 times) Thanks for help Debbie
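Spelling out the matrix suggestion at the poster's scale (an illustrative sketch): with one row per sample, per-sample summaries reduce to row-wise operations, and apply() covers anything rowMeans() doesn't.

```r
# 1000 samples of size 100, one per row.
set.seed(1)
k <- 1000; n <- 100
x <- matrix(rnorm(k * n), k, n)
means <- rowMeans(x)       # fast per-sample means
vars  <- apply(x, 1, var)  # per-sample variances, or any other summary
```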
Re: [R] Kumaraswamy distribution
dkumar = function(x,a,b) a*b*(x^(a-1))*((1-x^a)^(b-1))
pkumar = function(x,a,b) 1-(1-x^a)^b

(I've based this entirely on the Wikipedia entry on the Kumaraswamy distribution [http://en.wikipedia.org/wiki/Kumaraswamy_distribution], so best to check both my replication of the formula there and the accuracy of the Wikipedia entry itself.)

On Tue, May 12, 2009 at 6:37 AM, Debbie Zhang debbie0...@hotmail.com wrote:
Dear R users, Does anyone know how to write functions for the Kumaraswamy distribution in R? Since I cannot write dkumar, pkumar, etc. in R. Please help. Thanks a lot, Debbie
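A quick numerical self-check of those two functions (my addition, following the caveat above): the integral of the density from 0 to q should equal the CDF at q.

```r
# Same formulas as in the post.
dkumar <- function(x, a, b) a * b * (x^(a - 1)) * ((1 - x^a)^(b - 1))
pkumar <- function(x, a, b) 1 - (1 - x^a)^b

# Integrate the density numerically and compare with the CDF.
p_int <- integrate(dkumar, lower = 0, upper = 0.7, a = 2, b = 3)$value
p_cdf <- pkumar(0.7, 2, 3)
```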
Re: [R] Two-way Anova
Using traditional ANOVA, you'd have to drop either cases or time points with missing data. Using linear mixed effects analysis, you'd be able to use all the data. LME also has the benefit of *not* assuming sphericity, which is good for data like yours (many measures across few cases) where the traditional ANOVA sphericity assumption is unlikely to hold.

Your dependent variable, % viability, suggests that there's some more raw representation of the data that may be better to look at. For example, if % viability is derived from, say, the success/failure rate of 10 observations per sample/timepoint, you might want to take a look at the lme4 package (as suggested in a previous post: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q3/001160.html )

On Tue, May 12, 2009 at 6:03 AM, Alan O'Loughlin olo...@wyeth.com wrote:
Hello, I'm trying to do a comparison on a large scale, say a 10L bottle of liquid, and a small scale bottle of liquid, 0.5L. I have 5 different samples from each and they are measured over the space of 8 days as % viability, and the % viability decreases over time. However not all 10 samples got measured every day. How would I do a two-way anova on this in R? Thanks for any help. Regards, Al
Re: [R] Do you use R for data manipulation?
I work in cognitive science, where we collect one or more data files per participant in an experiment then merge those files to perform subsequent analyses. Sometimes some files are in wide format and others are in long format, necessitating reshaping. I've found R entirely satisfactory for this.* Indeed, I would be wary of an approach that attempts data manipulation *outside* of R, as I'm of the "raw data in, results out" school of thought: it's dangerous to isolate your data manipulation from your record of analysis. If you leave your raw data files untouched and perform all manipulation and analysis in one system (R), then there is a complete record of what's happened to the data from start to finish and it's easier to catch/correct errors.

The reshape package is great for reshaping between long and wide data formats, and the plyr package is great for computing summary statistics within cells of the design.

Mike

*note: I typically use Python for data collection (showing visual stimuli, recording responses, etc.), but have it spit out raw text files of the trial-by-trial data, and thus use it for only a bare minimum of processing.
Re: [R] General Question about Graphics
Check out ggplot2: http://had.co.nz/ggplot2/
Particularly the book: http://had.co.nz/ggplot2/book
For error bars on bar plots specifically: http://had.co.nz/ggplot2/geom_errorbar.html
Mike

On Wed, May 6, 2009 at 12:34 PM, steve_fried...@nps.gov wrote:
Greetings, I have encountered a situation with regards to plotting barcharts with associated error bars. My search for clues on how to accomplish this turned up some interesting information. Basically, I found that including error bars with barplots is considered not desirable, and hence it appears that there is no function to do this. Can someone offer suggestions on how to do this simple procedure or offer an alternative? Thanks Steve
Steve Friedman Ph. D. Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 steve_fried...@nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147
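There is also a base-graphics route (a sketch with made-up means and standard errors, since the post includes no data): barplot() returns the bar midpoints, which arrows() with 90-degree heads can turn into error bars.

```r
# Hypothetical group means and standard errors.
means <- c(3, 5, 4)
ses   <- c(0.4, 0.6, 0.5)

# barplot() returns the x positions of the bar centres.
mids <- barplot(means, names.arg = c("A", "B", "C"),
                ylim = c(0, max(means + ses) * 1.1))
# Flat-headed arrows from mean-SE to mean+SE make the error bars.
arrows(mids, means - ses, mids, means + ses,
       angle = 90, code = 3, length = 0.05)
```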
Re: [R] Bumps chart in R
(cross posting to the ggplot2 group for posterity) Here's how I'd approach it:

library(ggplot2)
text = letters[1:20]
tal1 = rnorm(20,5,2)
tal2 = rnorm(20,6,3)
dif = tal2-tal1
df0 = data.frame(text,tal1,tal2)
df = melt(
    data = df0
    , id.vars = 'text'
    , variable_name = 'tal'
)
df$dif = dif
df$col = ifelse(dif>0,'red',ifelse(dif<0,'blue','black'))
df$size = abs(dif)
# draw the plot
ggplot(
    data=df
    , aes(
        x=tal
        , y=value
        , group=text
    )
)+
geom_line(aes(size=size, colour=col))+
geom_text(aes(label=text))+
opts(legend.position='none')+
theme_bw()

Unfortunately it's not perfect: (1) col isn't being interpreted literally, so the lines aren't truly red and blue. (2) the line ends are square whereas usually they're rounded. (3) attempting to remove the legend via opts(legend.position='none') seems to fail.

On Wed, May 6, 2009 at 6:44 PM, Andreas Christoffersen achristoffer...@gmail.com wrote:
Okay - first off: Thank you all for your kind help so far. (Especially to Hadley for writing ggplot2.) I feel that I am beginning to understand the grammar of graphics - but obviously still have a long way to go. Some of the road I need to travel has to do with basic R. Anyway; instead of writing a new post - I thought it best to post in the current topic. Please correct me if I am wrong. Getting to the point: I have expanded upon the original bumps chart (or whatever type of chart it is). I'd like the line width to be proportional to the change between time 1 and time 2. Also: I'd like colours to be red if the change is negative, and black if the change is positive.
My solution produces some problems for me:

library(ggplot2) # Loads ggplot2
text <- letters[1:20] # letters is proxy for categories
tal1 <- rnorm(20,5,2) # random numbers for first
tal2 <- rnorm(20,6,3) # random numbers for second
dif <- tal2-tal1 # difference between second and first
df0 <- cbind(tal1,tal2,dif) # do dataframe
df <- melt(df0) # melt
farve <- c(2,1,1,2,2,1,2,2,2,2,1,1,1,2,2,1,1,1,2,2)
# define colours - black for positive change, red for negative change
# these colours I handcode - depending on the random generated - so it will not fit new data.
# draw the plot
qplot(X2, value, data=df, group=X1, geom="blank")+
geom_line(aes(group=X1), subset(df,X2!="dif"), size=scale(abs(subset(df,df$X2=="dif")$value),center=F,scale=T)[,1], colour=farve)+
geom_text(aes(label=X1), subset(df,X2=="tal2"), size=3, hjust=-3, vjust=1)+theme_bw()
# My questions:
# How to do colours automatically
# how to remove dif from the X axis? - subset doesn't seem to work? - e.g.
# qplot(subset(df,X2!="dif",X2, drop=T), value, data=df) - returns an error.
# How to use melt better - so that text becomes the id? id="text" doesn't work.

thanks in advance

On Tue, Apr 28, 2009 at 12:09 AM, Andreas Christoffersen achristoffer...@gmail.com wrote:
My legend is removed! - Couldn't find it in your ggplot2 book - but here it is. Brilliant - thank you very much.
Re: [R] plot estimates and their 95% confidence intervals
Take a look at ggplot2:
http://had.co.nz/ggplot2/
http://had.co.nz/ggplot2/book/
Particularly:
http://had.co.nz/ggplot2/geom_point.html
http://had.co.nz/ggplot2/geom_errorbar.html
http://had.co.nz/ggplot2/coord_flip.html

ex:
p = ggplot(my_data, aes(x=state, y=ami_mean))
p = p + geom_point()
p = p + geom_errorbar(aes(ymin=ami_low, ymax=ami_up))
p = p + coord_flip()
print(p)

On Mon, Apr 27, 2009 at 6:00 PM, He, Yulei h...@hcp.med.harvard.edu wrote:
Hi, there: I have a dataset with 50 states and for each state, I have its associated mean estimate (for some parameters) and the lower and upper bound of the 95% CI. The data look like below:
  state ami_mean  ami_low   ami_up
1    MS -0.58630 -0.90720 -0.29580
2    KY -0.48100 -0.75990 -0.19470
3    FL -0.47900 -0.62930 -0.32130
I would like to have a plot of the 95% CI (characterized by the mean, lower, and upper bound, and the lines linking them) for each state, with the x-axis the parameter estimate and the y-axis the state. What kind of plot functions can I use? Thanks. Yulei
Re: [R] Bumps chart in R
Here's a ggplot2 based solution:

#load the ggplot2 library
library(ggplot2)

#here's the data provided by Andreas
countries <- c("U-lande", "Afrika syd for sahara", "Europa og Centralasien", "Lantinamerika og Caribien", "Mellemøsten og Nordafrika", "Sydasien", "Østasien og stillehavet", "Kina", "Brasilien")
poor_1990 <- c(28.7,46.7,0.5,10.2,2.3,43,29.8,33,14)
poor_2004 <- c(18.1,41.1,0.9,8.6,1.5,30.8,9.1,9.9,7.5)

#reformat the data
data = data.frame(countries, poor_1990, poor_2004)
data = melt(data, id=c('countries'), variable_name='year')
levels(data$year) = c('1990','2004')

#make a new column to make the text justification easier
data$hjust = 1-(as.numeric(data$year)-1)

#start the percentage plot
p = ggplot(
    data
    , aes(
        x=year
        , y=value
        , groups=countries
    )
)

#add the axis labels
p = p + labs(x = '\nYear', y = '%\n')

#add lines
p = p + geom_line()

#add the text
p = p + geom_text(aes(label=countries, hjust=hjust))

#expand the axis to fit the text
p = p + scale_x_discrete(expand=c(2,2))

#show the plot
print(p)

#rank the countries
data$rank = NA
data$rank[data$year=='1990'] = rank(data$value[data$year=='1990'])
data$rank[data$year=='2004'] = rank(data$value[data$year=='2004'])

#start the rank plot
r = ggplot(
    data
    , aes(
        x=year
        , y=rank
        , groups=countries
    )
)

#add the axis labels
r = r + labs(x = '\nYear', y = 'Rank\n')

#add the lines
r = r + geom_line()

#add the text
r = r + geom_text(aes(label=countries, hjust=hjust))

#expand the axis to fit the text
r = r + scale_x_discrete(expand=c(2,2))

#show the plot
print(r)
Re: [R] boxplot of two variables
Check out ggplot2: http://had.co.nz/ggplot2
especially: http://had.co.nz/ggplot2/geom_boxplot.html
But you are strongly advised to read the book: http://had.co.nz/ggplot2/book/

On Thu, Apr 23, 2009 at 12:13 PM, Gabriel R. Rodriguez garo...@cnia.inta.gov.ar wrote:
Hello! I have a dataframe with 6 variables (A1,A2,B1,B2,C1,C2) and 1 factor (F). I would like to produce a graph consisting of 3 boxplot sets, one for every two variables (i.e. A1 A2) by the factor (F). I was looking around and I cannot figure it out, any suggestions? Best Regards, Gabriel
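A base-graphics sketch with made-up data shaped like the question (two of the six columns shown; the others would work the same way): reshape the paired columns to long format and let boxplot()'s formula interface do the grouping by variable and factor.

```r
# Hypothetical data: two paired variables plus a grouping factor F.
df <- data.frame(A1 = rnorm(40), A2 = rnorm(40),
                 F  = rep(c("ctrl", "trt"), each = 20))

# Stack A1 and A2 into one value column with a variable label.
long <- data.frame(value = c(df$A1, df$A2),
                   var   = rep(c("A1", "A2"), each = nrow(df)),
                   F     = rep(df$F, 2))

# One box per var-by-F combination.
boxplot(value ~ var + F, data = long)
```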
Re: [R] Sampling in R
Perm.Cor.Sorted <- sort(Perm.Cor, decreasing=TRUE)
if(mat.cor < 0) Perm.Cor.Sorted <- sort(Perm.Cor, decreasing=FALSE)
T95 <- Perm.Cor.Sorted[(nperm+1)*0.05] # 95% threshold value
T99 <- Perm.Cor.Sorted[(nperm+1)*0.01] # 99% threshold value

I want to understand where I am making a mistake. Any comment is deeply appreciated. Kind Regards Seyit Ali
-- Dr. Seyit Ali KAYIS Selcuk University Faculty of Agriculture Kampus, Konya, TURKEY s_a_ka...@yahoo.com, s_a_ka...@hotmail.com Tel: +90 332 223 2830 Mobile: +90 535 587 1139 Fax: +90 332 241 0108 Greetings from Konya, TURKEY http://www.ziraat.selcuk.edu.tr/skayis/
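A hedged restatement of the thresholding step (toy permutation statistics, not the original correlation code): quantile() reads the 95% and 99% cutoffs straight off the permutation distribution, with no manual sorting or indexing to go off by one.

```r
# Null distribution of a correlation under independence,
# via 999 simulated "permutations".
set.seed(1)
perm.cor <- replicate(999, cor(rnorm(20), rnorm(20)))

# Upper-tail thresholds, read off directly.
T95 <- quantile(perm.cor, 0.95)
T99 <- quantile(perm.cor, 0.99)
```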
[R] OT: Strange bootstrap approach to significance testing
Hi all, I'm reviewing a paper that employs a strange (to me) approach to non-parametric significance testing. I'm familiar with permutation tests and with bootstrapping confidence intervals around parameter estimates, but here they seem to be bootstrapping CIs for a manufactured null F-value. They have a simple 3-group between-Ss design and compute the F-value of the observed data. Then they re-center each group's mean to zero and repeatedly: randomly resample values from each group (with replacement, doubling each group's size) and re-compute the F-value. The distribution of re-centered/resampled F-values is then used as a reference distribution within which the percentile of the observed F-value is computed. They determine that the observed F-value is outside the 95% confidence interval of the re-centered/resampled F-values and conclude a significant effect of group. In my head this makes some sense (though I'd think a straight permutation test would be a lot simpler), but having never heard of anything like this before, I thought I'd see what others on this list think of the approach. Mike
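For concreteness, here is a sketch in R of the procedure as described (fake data; my own reading of their method, not an endorsement of it):

```r
set.seed(1)
g <- gl(3, 20)                            # 3 groups, n = 20 each
y <- rnorm(60) + as.numeric(g) * 0.5      # invented data with a group effect

obs_F <- anova(lm(y ~ g))[["F value"]][1] # observed F

y0 <- y - ave(y, g)                       # re-center each group's mean to zero
null_F <- replicate(999, {
  # resample each re-centered group with replacement, doubling its size
  yb <- unlist(lapply(split(y0, g), function(v)
    sample(v, 2 * length(v), replace = TRUE)))
  gb <- gl(3, 2 * 20)
  anova(lm(yb ~ gb))[["F value"]][1]
})

mean(null_F >= obs_F)                     # percentile-style "p-value"
```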
Re: [R] Paired test for 2-way ANOVA
Paired (aka repeated-measures, aka within-Ss) tests can be achieved by using aov() and specifying the within-Ss effect in the error term:

my_aov <- aov(dependent_variable ~ between_Ss_variable*within_Ss_variable +
  Error(Ss_id/(within_Ss_variable)))
summary(my_aov)

On Mon, Apr 20, 2009 at 10:25 AM, A Ezhil ezhi...@yahoo.com wrote: Hi, I have experimental data with 2 factors: visit and treatment. Each subject has 2 visits, and on each visit they get a treatment. We take samples before and after treatment. I can easily do the 2-way ANOVA in R with visit and treatment as factors:

anova( lm(data ~ visit*treatment) )

If I want to do the same 2-way ANOVA for the paired samples, how can I do that? Is there an R package for that? Thanks in advance. Kind regards, Ezhil
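Applied to Ezhil's visit-by-treatment design, the template above might look like this (all names and data are invented for illustration):

```r
set.seed(1)
d <- expand.grid(subject   = factor(1:10),
                 visit     = factor(1:2),
                 treatment = factor(c("pre", "post")))
d$response <- rnorm(nrow(d))

# both factors vary within subject, so both go in the Error() term
fit <- aov(response ~ visit * treatment + Error(subject/(visit * treatment)),
           data = d)
summary(fit)
```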
Re: [R] Welcome to the R-help mailing list
When plotting in loops with lattice, you need to wrap your plot call in print().

On Sat, Apr 18, 2009 at 2:14 PM, Qifei Zhu zhu_qi...@yahoo.com.sg wrote: Hi all, I'm a newbie R developer trying to dotplot a few graphs using a for loop. The following code works fine, but once I want to plot inside a loop, nothing happens:

y <- c(1,2,3)
x <- c('a','b','c')
dotplot(y~x)                      # works
for (i in 1:3) { dotplot(y~x) }   # nothing happens

(y and x depend on i in the actual case.) I'd appreciate your advice on what is going wrong. Thanks. Best, Tony
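The fix in full (lattice is assumed here, since dotplot() usually comes from that package; at top level R auto-prints the returned trellis object, but inside a loop nothing triggers printing):

```r
library(lattice)

y <- c(1, 2, 3)
x <- factor(c("a", "b", "c"))

for (i in 1:3) {
  print(dotplot(y ~ x))   # print() forces the trellis object to render
}
```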
Re: [R] performing function on data frame
As Michael notes, scale() is what you want. Also, your request has an incorrect definition of z-scores:

d$y - mean(d$y)/sd(d$y)       # incorrect
(d$y - mean(d$y)) / sd(d$y)   # correct

On Thu, Apr 16, 2009 at 7:40 AM, Michael Conklin michael.conk...@markettools.com wrote:

newDF <- as.data.frame(scale(oldDF))

see ?scale. Hope that helps. Michael Conklin

-----Original Message----- From: r-help-boun...@r-project.org On Behalf Of Karin Lagesen Sent: Thursday, April 16, 2009 5:29 AM To: r-help@r-project.org Subject: Re: [R] performing function on data frame

David Hajage dhajag...@gmail.com writes: "Hi Karin, I'm not sure I understand... Is this what you want? d$y - mean(d$y)/sd(d$y)"

Yes, and also a bit no. Each column in my data frame represents one data set. For every element in this data set I want to know the z value for that element. I.e., I want to create a new data frame from the old data frame, where each element in the new data frame is

newDF[i,j] = oldDF[i,j] - mean(d[,j]) / sddev(d[,j])

I could, I think, iterate like this over the data frame, but I keep thinking that one of the apply functions should be employed... Karin

--
Karin Lagesen, Ph.D. karin.lage...@medisin.uio.no http://folk.uio.no/karinlag
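For Karin's apply-style instinct, here are scale() and an explicit column-wise version side by side (data invented for illustration; both give per-column z-scores):

```r
oldDF <- data.frame(a = rnorm(10), b = rnorm(10))

newDF <- as.data.frame(scale(oldDF))    # z-scores, all columns at once

# the same thing spelled out per column, as Karin describes
newDF2 <- as.data.frame(lapply(oldDF, function(y) (y - mean(y)) / sd(y)))
```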
Re: [R] F test
summary(my_lm) will give you t-values, anova(my_lm) will give you (equivalent) F-values. summary() might be preferred because it also provides the estimates' SEs.

a = data.frame(dv=rnorm(10), iv1=rnorm(10), iv2=rnorm(10))
my_lm = lm(dv ~ iv1*iv2, a)
summary(my_lm)

Call:
lm(formula = dv ~ iv1 * iv2, data = a)

Residuals:
     Min       1Q   Median       3Q      Max
 -1.8484  -0.2059   0.1627   0.4623   1.0401

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.4864     0.4007  -1.214    0.270
iv1           0.8233     0.5538   1.487    0.188
iv2           0.2314     0.3863   0.599    0.571
iv1:iv2      -0.4110     0.5713  -0.719    0.499

Residual standard error: 1.017 on 6 degrees of freedom
Multiple R-squared: 0.3161, Adjusted R-squared: -0.02592
F-statistic: 0.9242 on 3 and 6 DF, p-value: 0.4842

anova(my_lm)
Analysis of Variance Table

Response: dv
          Df Sum Sq Mean Sq F value Pr(>F)
iv1        1 1.9149  1.9149  1.8530 0.2223
iv2        1 0.4156  0.4156  0.4021 0.5494
iv1:iv2    1 0.5348  0.5348  0.5175 0.4990
Residuals  6 6.2004  1.0334

On Thu, Apr 16, 2009 at 10:35 AM, kayj kjaj...@yahoo.com wrote: Hi, how can I find the p-value for the F test for the interaction terms in a linear regression model (lm)? I appreciate your help.
Re: [R] F test
I'm new to LME myself, so it would be best for others to advise on this.

On Thu, Apr 16, 2009 at 3:00 PM, Jun Shen jun.shen...@gmail.com wrote: Mike, I kind of have the same question. For a mixed-effect model, say using lme(), how do you specify an interaction effect between a fixed effect and a random effect, and where do you find the result of the interaction? Thanks. Jun

On Thu, Apr 16, 2009 at 12:08 PM, Mike Lawrence mike.lawre...@dal.ca wrote: summary(my_lm) will give you t-values, anova(my_lm) will give you (equivalent) F-values. summary() might be preferred because it also provides the estimates' SEs.

--
Jun Shen PhD, PK/PD Scientist, BioPharma Services, Millipore Corporation
15 Research Park Dr., St Charles, MO 63304. Direct: 636-720-1589
Re: [R] F test
"Ahem. Equivalent, my tired foot..." My bad, I wasn't paying attention. "May I suggest consulting a textbook *before* flunking ANOVA 101?" Harsh but warranted given my carelessness.

On Thu, Apr 16, 2009 at 3:47 PM, Emmanuel Charpentier charp...@bacbuc.dyndns.org wrote: On Thursday 16 April 2009 at 14:08 -0300, Mike Lawrence wrote:

summary(my_lm) will give you t-values, anova(my_lm) will give you (equivalent) F-values.

Ahem. Equivalent, my tired foot... In simple terms (the real story may be more intricate): the F values stated by anova() are something entirely different from the t values in summary(). The latter allow you to assess properties of *one* coefficient in your model (namely, do I have enough support to state that it is nonzero?). The former allow you to assess whether you have support for stating that *ALL* the coefficients related to the same factor cannot be *SIMULTANEOUSLY* null. Which is a horse of quite another color...

By the way: while your summary indeed gives you an unbiased estimate of your coefficient and a (hopefully) unbiased estimate of its standard error, the F ratio is the ratio of estimates of remaining variabilities with and without the H0 assumption it tests, that is, that *ALL* coefficients of your factor of interest are *SIMULTANEOUSLY* null. F and t numbers will be equivalent if and only if your factor of interest needs only one coefficient to be expressed, i.e. is a continuous covariate or a two-class discrete variable (such as a boolean). In this case, you can test your factor either by the t value, which under H0 fluctuates as a Student's t with n_res dof (n_res being the residual degrees of freedom of the model), or by the F value, which will fluctuate as a Fisher F statistic with 1 and n_res dof, which happens (but that's not happenstance...) to be the *square* of a t with n_res dof.

May I suggest consulting a textbook *before* flunking ANOVA 101?

Emmanuel Charpentier
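Emmanuel's single-coefficient case is easy to verify numerically: for a one-df term, the F reported by anova() is exactly the square of the t reported by summary() (illustrative data):

```r
set.seed(1)
d <- data.frame(y = rnorm(20), x = rnorm(20))   # x: continuous covariate (1 df)
fit <- lm(y ~ x, data = d)

t_val <- summary(fit)$coefficients["x", "t value"]
F_val <- anova(fit)["x", "F value"]
all.equal(t_val^2, F_val)   # TRUE: F = t^2 for a single-df term
```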
Re: [R] How to apply a function to all combinations of the values of 2 or more variables?
Check out plyr: http://had.co.nz/plyr/

On Wed, Apr 15, 2009 at 2:03 PM, Lane, Jim jim.l...@rbc.com wrote: Hi, all. Forgive me if this is a stupid newbie question. I'm having no luck googling an answer to this, probably because I don't know the right R terminology to frame my question. I want to know how to run an R function on each combination of the values of 2 or more variables. In SAS-speak this is multiple BY variables or CLASS variables. In R I've figured out how to do what I want with one by value:

by(mf, mf$centre, summary)

Centre is one of the columns in the data frame mf, which looks like this:

names(mf)
[1] "centre" "complex" "appl" "pool" "month" "alloc_gb"

I'd like to analyze, for example, by complex within centre. This has a manageable number of combinations:

table(mf$centre, mf$complex)
  A B C D E F G H I J K L
B 0 0 0 0 0 0 0 0 0 0 0 60574 0
G 44 209 0 94613 0 156 0 2541 0 748 0 0 215511
O 0 0 0 0 0 0 3446 0 676 0 84400 0 0
Q 0 0 298 0 66277 0 0 0 0 0 0 0 0

In SAS what I'm after is something like:

Proc summary nway;
  class centre complex;
  var alloc_gb;
  output out=s sum=;
run;

How do I get something similar in R?

Jim Lane, Capacity Planner, RBC Financial Group
315 Front St W, 6th Floor - H14, Toronto, Ontario, CANADA M5V 3A4. 416-348-6024

"Display of superior knowledge is as great a vulgarity as display of superior wealth - greater indeed, inasmuch as knowledge should tend more definitely than wealth towards discretion and good manners." - H. W. Fowler, Modern English Usage

This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.
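Besides plyr, base R can reproduce PROC SUMMARY's centre-by-complex sums directly (the mf here is invented; only the column names come from Jim's post):

```r
set.seed(1)
mf <- data.frame(centre   = sample(c("B", "G", "O", "Q"), 50, replace = TRUE),
                 complex  = sample(LETTERS[1:4], 50, replace = TRUE),
                 alloc_gb = runif(50, 1, 100))

# long form: one row per observed centre x complex combination
aggregate(alloc_gb ~ centre + complex, data = mf, FUN = sum)

# or as a cross-tabulated matrix
tapply(mf$alloc_gb, list(mf$centre, mf$complex), sum)
```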
Re: [R] cbind
a=c(NA,NA,2,3,NA,NA,NA) which.min(is.na(a)) [1] 3 b=1:length(a) b-3 [1] -2 -1 0 1 2 3 4 cbind(b=b-3,a=a) b a [1,] -2 NA [2,] -1 NA [3,] 0 2 [4,] 1 3 [5,] 2 NA [6,] 3 NA [7,] 4 NA On Tue, Apr 14, 2009 at 7:20 AM, emj83 stp08...@shef.ac.uk wrote: I have a list of numbers with NAs as below: A[,1] [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [19] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [37] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [55] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA [73] NA NA NA 62 78 98 73 57 63 56 88 77 151 165 129 78 83 63 [91] 72 68 61 89 95 74 53 77 90 106 114 113 84 59 60 77 46 91 [109] 108 111 76 75 70 61 65 61 52 94 71 67 52 86 79 97 80 101 [127] 87 53 85 79 86 104 153 128 155 148 NA NA NA NA NA NA NA NA [145] NA NA NA NA NA NA NA I would like to bind a column to this list which begins at 0 when the first number has occured, but provides negative numbers prior to this as below: [1,] -75 NA [2,] -74 NA [3,] -73 NA [4,] -72 NA [5,] -71 NA [6,] -70 NA [7,] -69 NA [8,] -68 NA [9,] -67 NA [10,] -66 NA [11,] -65 NA [12,] -64 NA [13,] -63 NA [14,] -62 NA [15,] -61 NA [16,] -60 NA [17,] -59 NA [18,] -58 NA [19,] -57 NA [20,] -56 NA [21,] -55 NA [22,] -54 NA [23,] -53 NA [24,] -52 NA [25,] -51 NA [26,] -50 NA [27,] -49 NA [28,] -48 NA [29,] -47 NA [30,] -46 NA [31,] -45 NA [32,] -44 NA [33,] -43 NA [34,] -42 NA [35,] -41 NA [36,] -40 NA [37,] -39 NA [38,] -38 NA [39,] -37 NA [40,] -36 NA [41,] -35 NA [42,] -34 NA [43,] -33 NA [44,] -32 NA [45,] -31 NA [46,] -30 NA [47,] -29 NA [48,] -28 NA [49,] -27 NA [50,] -26 NA [51,] -25 NA [52,] -24 NA [53,] -23 NA [54,] -22 NA [55,] -21 NA [56,] -20 NA [57,] -19 NA [58,] -18 NA [59,] -17 NA [60,] -16 NA [61,] -15 NA [62,] -14 NA [63,] -13 NA [64,] -12 NA [65,] -11 NA [66,] -10 NA [67,] -9 NA [68,] -8 NA [69,] -7 NA [70,] -6 NA [71,] -5 NA [72,] -4 NA [73,] -3 NA [74,] -2 NA [75,] -1 NA [76,] 0 62 [77,] 1 78 [78,] 2 98 [79,] 3 73 [80,] 4 57 [81,] 5 63 [82,] 6 56 [83,] 
7 88 [84,] 8 77 [85,] 9 151 [86,] 10 165 [87,] 11 129 [88,] 12 78 [89,] 13 83 [90,] 14 63 [91,] 15 72 [92,] 16 68 [93,] 17 61 [94,] 18 89 [95,] 19 95 [96,] 20 74 [97,] 21 53 [98,] 22 77 [99,] 23 90 [100,] 24 106 [101,] 25 114 [102,] 26 113 [103,] 27 84 [104,] 28 59 [105,] 29 60 [106,] 30 77 [107,] 31 46 [108,] 32 91 [109,] 33 108 [110,] 34 111 [111,] 35 76 [112,] 36 75 [113,] 37 70 [114,] 38 61 [115,] 39 65 [116,] 40 61 [117,] 41 52 [118,] 42 94 [119,] 43 71 [120,] 44 67 [121,] 45 52 [122,] 46 86 [123,] 47 79 [124,] 48 97 [125,] 49 80 [126,] 50 101 [127,] 51 87 [128,] 52 53 [129,] 53 85 [130,] 54 79 [131,] 55 86 [132,] 56 104 [133,] 57 153 [134,] 58 128 [135,] 59 155 [136,] 60 148 [137,] 61 NA [138,] 62 NA [139,] 63 NA [140,] 64 NA [141,] 65 NA [142,] 66 NA [143,] 67 NA [144,] 68 NA [145,] 69 NA [146,] 70 NA [147,] 71 NA [148,] 72 NA [149,] 73 NA [150,] 74 NA [151,] 75 NA

Could anyone help me with a function that would be able to calculate the sequence I require to bind to the initial sequence? Thanks, Emma
Re: [R] cbind
oops, of course I meant:

a=c(NA,NA,2,3,NA,NA,NA)
b=1:length(a)
cbind(b=b-which.min(is.na(a)),a=a)
Re: [R] cbind
Unfortunately Dimitris' solution fails in the face of NA padding on both sides of the numeric data, as in Emma's original example:

x <- c(rep(NA, 20), sample(100, 25), rep(NA, 20))

On Tue, Apr 14, 2009 at 9:40 AM, Dimitris Rizopoulos d.rizopou...@erasmusmc.nl wrote: try this:

x <- c(rep(NA, 20), sample(100, 25))
n.na <- sum(is.na(x))
cbind(seq(-n.na, length(x) - n.na - 1), x)

I hope it helps. Best, Dimitris

--
Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478  Fax: +31/(0)10/7043014
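A variant that survives NA padding on both sides is to anchor the counter at the first non-NA position, as in the earlier which.min(is.na()) suggestion:

```r
x <- c(rep(NA, 20), sample(100, 25), rep(NA, 20))

first <- which.min(is.na(x))             # index of the first non-NA value
cbind(index = seq_along(x) - first, x)   # index is 0 at the first observed value
```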
Re: [R] Group by in R
One way:

g <- paste(f1, f2, f3, f4)
table(g)

On Mon, Apr 13, 2009 at 7:33 AM, Nick Angelou nikola...@yahoo.com wrote: Hi, I have the following table data: f1, f2, f3, f4. I want to compute the counts of unique combinations of f1-f4. In SQL I would just write:

SELECT COUNT(*) FROM table GROUP BY f1, f2, ..., f4

How to do this in R? Thanks, Nick
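The counts can also be kept as a data frame, which is closer to the SQL GROUP BY output (a sketch with two invented factors instead of Nick's four):

```r
set.seed(1)
f1 <- sample(c("a", "b"), 20, replace = TRUE)
f2 <- sample(c("x", "y"), 20, replace = TRUE)

counts <- as.data.frame(table(f1 = f1, f2 = f2))
counts[counts$Freq > 0, ]   # one row per observed combination, with its count
```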
Re: [R] weighted mean and by() with two index
Sounds like a job for plyr: http://had.co.nz/plyr

On Mon, Apr 13, 2009 at 7:56 PM, Dong H. Oh r.arec...@gmail.com wrote: Hi expeRts, I would like to calculate a weighted mean by two factors. My code is as follows:

tmp <- by(re$meta.sales.lkm[, c("pc", "sales")],
          re$meta.sales.lkm[, c("size", "yr")],
          function(x) weighted.mean(x[,1], x[,2]))

The result is as follows:

tmp
size: micro   yr: 1994  [1] 1.090
size: small   yr: 1994  [1] 1.135
size: medium  yr: 1994  [1] 1.113
size: large   yr: 1994  [1] 1.105
size: micro   yr: 1995  [1] 1.167
size: small   yr: 1995  [1] 1.096
size: medium  yr: 1995  [1] 1.056

But the form I want to get is as follows:

        1994  1995  1996 ...
micro   1.090 1.167 ...
small   1.135 1.096
medium  1.113 1.056
large   1.105 ...

That is, the result should be tabularized. How can I get the above form directly? (I don't want to modify tmp with as.vector() and matrix() to get the result.) Thank you in advance.

--
Donghyun Oh, CESIS, KTH
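Without plyr, tapply() over row indices gives the tabular layout directly (the column names follow the question; the data frame is invented):

```r
set.seed(1)
d <- data.frame(pc    = runif(40, 1, 1.2),
                sales = runif(40),
                size  = sample(c("micro", "small", "medium"), 40, replace = TRUE),
                yr    = sample(1994:1995, 40, replace = TRUE))

# weighted mean of pc (weights = sales) for each size x yr cell,
# returned as a size-by-year matrix
tapply(seq_len(nrow(d)),
       list(size = d$size, yr = d$yr),
       function(i) weighted.mean(d$pc[i], d$sales[i]))
```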
Re: [R] taking the log then later exponentiate the result query
Your problem is that with the alpha and beta you've specified,

(((alpha-1)^(alpha-1))*((beta-1)^(beta-1)))/((beta+alpha-2)^(alpha+beta-2))

is Inf/Inf, which is NaN.

On Sun, Apr 12, 2009 at 5:39 PM, Mary Winter statsstud...@hotmail.com wrote: Hi, I am trying to figure out the observed acceptance rate and M, using generalised rejection sampling to generate a sample from the posterior distribution for p. I have been told my code doesn't work because I need to take the log of the expression for M, evaluate it and then exponentiate the result. This is because R is unable to calculate high powers such as 545.501. As you can see in my code I have tried taking the log of M and then the exponential of the result, but I clearly must be doing something wrong. I keep getting the error message: Error in if (U <= ratio/exp(M)) { : missing value where TRUE/FALSE needed. Any ideas how I go about correctly taking the log and then the exponential?

rvonmises.norm <- function(n, alpha, beta) {
    out <- rep(0, n)
    counter <- 0
    total.sim <- 0
    p <- alpha/(alpha+beta)
    M <- log((((alpha-1)^(alpha-1))*((beta-1)^(beta-1)))/((beta+alpha-2)^(alpha+beta-2)))
    while(counter < n) {
        total.sim <- total.sim + 1
        proposal <- runif(1)
        if(proposal >= 0 && proposal <= 1) {
            U <- runif(1)
            ratio <- (p^(alpha-1))*((1-p)^(beta-1))
            if(U <= ratio/exp(M)) {
                counter <- counter + 1
                out[counter] <- proposal
            }
        }
    }
    obs.acc.rate <- n/total.sim
    return(list(out = out, obs.acc.rate = obs.acc.rate, M = M))
}
set.seed(220)
temp <- rvonmises.norm(1, 439.544, 545.501)
print(temp$obs.acc.rate)

Louisa
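For the log/exp step itself, the trick is never to form the huge powers at all: expand the log of the product into a sum of logs. A sketch using the posted alpha and beta (the direct computation gives Inf/Inf, hence NaN):

```r
alpha <- 439.544
beta  <- 545.501
# direct computation overflows: (alpha-1)^(alpha-1) is Inf, so M would be NaN
logM <- (alpha - 1) * log(alpha - 1) +
        (beta  - 1) * log(beta  - 1) -
        (alpha + beta - 2) * log(alpha + beta - 2)
# then compare on the log scale: accept when log(U) <= log(ratio) - logM
```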
Re: [R] Quantative procedure assessing if data is normal
www.rseek.org "normality test"

On Sun, Apr 12, 2009 at 8:45 PM, Henry Cooper henry.1...@hotmail.co.uk wrote: Hi, As part of an R code assignment I have been asked to find a quantitative procedure for assessing whether or not the data are normal. I have previously used the graphical procedure via the qqnorm command. Any help/tips would be greatly appreciated as to how I should start going about this! Henry
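Besides searching rseek, one standard quantitative procedure that ships with base R is the Shapiro-Wilk test; a minimal sketch:

```r
set.seed(42)
x <- rnorm(100)
shapiro.test(x)  # a small p-value is evidence against normality
```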
Re: [R] Genstat into R - Randomisation test
Looks like that code implements a non-exhaustive variant of the randomization test, sometimes called a permutation test. Assuming you have a data frame with columns ID and CD, this should do it:

n.obs = 100
ID = rnorm(n.obs)
CD = rnorm(n.obs)
obs.cor = cor(ID, CD)
num.permutations = 1e4
perm.cor = rep(NA, num.permutations)
start.time = proc.time()[1]
index = 1:n.obs
for(i in 1:num.permutations){
    IDorder = sample(index)
    CDorder = sample(index)
    perm.cor[i] = .Internal(cor(ID[IDorder], CD[CDorder], 4, FALSE))
}
cat('Elapsed time:', start.time - proc.time()[1])
sum(perm.cor > obs.cor)/num.permutations

On Wed, Apr 8, 2009 at 10:02 AM, Anne Kempel kem...@ips.unibe.ch wrote: Hello everybody, I have a question. I would like to get a correlation between constitutive and induced plant defence, which I measured on 30 plant species. So I have a table with Species, Induced defence (ID), and constitutive defence (CD). Since induced and constitutive defence are not independent (a so-called spurious correlation) I should do a randomisation test. I have a syntax from my supervisor in Genstat, but I would really like to try this in R.

data from trade-off.IDCD
list
variate [nval=1000] slope
calc ID1=ID
graph ID; CD
calc b=corr(ID; CD)
calc slope$[1]=b
slope$[1] is the correlation before permutating the data
for i=2...1000
randomize ID1
calc b=corr(CD1; ID1)
calc slope$[i]=b
endfor
hist slope
describe slope
quantile [proportion=!(0.0005,0.005, 0.025, 0.975, 0.995,0.9995)] slope
print slope$[1]
corr [p=corr] ID,CD
DHISTOGRAM [WINDOW=1; ORIENTATION=vertical; KEY=0; BARCOVERING=1.0] slope; PEN=2

Has anybody done something similar and has any idea how to make the randomisation part? I would be very grateful for any help!!
Thanks in advance, Anne -- Anne Kempel Institute of Plant Sciences University of Bern Altenbergrain 21 CH-3103 Bern kem...@ips.unibe.ch
Re: [R] Genstat into R - Randomisation test
"Assuming you have a data frame with columns ID and CD, this should do it" Oops, the code I sent doesn't assume this. It assumes that you have two vectors, ID and CD, as generated in the first 3 lines.

On Wed, Apr 8, 2009 at 11:31 AM, Mike Lawrence mike.lawre...@dal.ca wrote: Looks like that code implements a non-exhaustive variant of the randomization test, sometimes called a permutation test. [snip]
Re: [R] Genstat into R - Randomisation test
*Hits head* Of course, the approach taken by your Genstat code of only shuffling one variable is sufficient and faster:

n.obs = 100
ID = rnorm(n.obs)
CD = rnorm(n.obs)
obs.cor = cor(ID, CD)
num.permutations = 1e4
perm.cor = rep(NA, num.permutations)
start.time = proc.time()[1]
index = 1:n.obs
for(i in 1:num.permutations){
    IDorder = sample(index)
    perm.cor[i] = .Internal(cor(ID[IDorder], CD, 4, FALSE))
}
cat('Elapsed time:', start.time - proc.time()[1])
sum(perm.cor > obs.cor)/num.permutations

On Wed, Apr 8, 2009 at 11:33 AM, Mike Lawrence mike.lawre...@dal.ca wrote: "Assuming you have a data frame with columns ID and CD, this should do it" Oops, the code I sent doesn't assume this. It assumes that you have two vectors, ID and CD, as generated in the first 3 lines. [snip]
Re: [R] Genstat into R - Randomisation test
*sigh* and the elapsed time line should be:

cat('Elapsed time:', proc.time()[1] - start.time)

On Wed, Apr 8, 2009 at 11:35 AM, Mike Lawrence mike.lawre...@dal.ca wrote: *Hits head* Of course, the approach taken by your Genstat code of only shuffling one variable is sufficient and faster: [snip]
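Pulling the thread's corrections together, the whole test can be written more compactly with replicate(), and with cor() in place of the .Internal() call (calling .Internal() directly is generally discouraged in user code); a sketch on simulated vectors:

```r
set.seed(1)
n.obs <- 30
ID <- rnorm(n.obs)
CD <- rnorm(n.obs)
obs.cor <- cor(ID, CD)
# shuffle one variable only; that already randomizes the pairing
perm.cor <- replicate(1e4, cor(sample(ID), CD))
mean(perm.cor >= obs.cor)  # one-sided permutation p-value
```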
Re: [R] Genstat into R - Randomisation test
I was taught that Fisher proposed the F-test as a computationally simpler approximation to what he called a Randomization test, consisting of exhaustive permutations. I never looked at the original Fisher reference myself, so this may be false. However, I haven't observed a consistent nomenclature when I have seen these tests discussed, so I typically make sure to mention whether what I'm doing is exhaustive or non-exhaustive. I do see the value in your interpretation, and think it makes sense to drop "randomization" as a name (despite its possible historical significance) and start using "exhaustive permutation test" (to contrast with "non-exhaustive permutation test").

On Wed, Apr 8, 2009 at 1:18 PM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote: Mike Lawrence wrote: "Looks like that code implements a non-exhaustive variant of the randomization test, sometimes called a permutation test." Isn't it the other way around? (Permutation tests can be exhaustive by looking at all permutations; if a randomization test did that, then it wouldn't be random.) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
Re: [R] threshold distribution
You didn't properly specify the x-axis coordinates in your call to lines():

plot(density(x))
lines(x, dlnorm(x, -.1348, .1911), col='red')
points(x, dlnorm(x, -.1348, .1911), col='blue')
curve(dlnorm(x, -.1348, .1911), col='green', add=T)

On Sun, Apr 5, 2009 at 6:40 AM, Bernardo Rangel Tura t...@centroin.com.br wrote: On Sun, 2009-04-05 at 01:13 -0400, jim holtman wrote: Here is what I get from using 'fitdistr' in R to fit to a lognormal. The resulting density plot from the distribution seems to be a reasonable match to the data.

x <- scan()
1: 0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559
9: 0.85009 0.85804 0.73324 1.04826 0.84002
14: 1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549
22: 0.93641 0.80418 0.95285 0.76876 0.82588
27: 1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696
35: 0.82364 0.84390 0.71402 0.80293 1.02873
40:
Read 39 items

plot(density(x))
library(MASS)
fitdistr(x, 'lognormal')
   meanlog      sdlog
-0.13480636  0.19118861
(0.03061468) (0.02164785)
lines(dlnorm(x, -.1348, .1911), col='red')

Hi Jim, I agree with your solution but my plot result is not fine. I obtain the same result:

fitdistr(x, 'lognormal')
   meanlog      sdlog
-0.13480636  0.19118861
(0.03061468) (0.02164785)

In the plot, when I use points (blue) and curve (green) the fit of the lognormal and density(data) is fine, but when I use lines (red) the fit is bad (in the attachment I send a PDF of my output). Do you know why this happens? Thanks in advance -- Bernardo Rangel Tura, M.D, MPH, Ph.D National Institute of Cardiology Brazil P.S.
my script is:

x <- scan()
0.80010 0.72299 0.69893 0.99597 0.89200 0.69312 0.73613 1.13559
0.85009 0.85804 0.73324 1.04826 0.84002
1.76330 0.71980 0.89416 0.89450 0.98670 0.83571 0.73833 0.66549
0.93641 0.80418 0.95285 0.76876 0.82588
1.09394 1.00195 1.14976 0.80008 1.11947 1.09484 0.81494 0.68696
0.82364 0.84390 0.71402 0.80293 1.02873

require(MASS)
fitdistr(x, 'lognormal')
pdf("adj.pdf")
plot(density(x))
lines(dlnorm(x, -.1348, .1911), col='red')
points(x, dlnorm(x, -.1348, .1911), col='blue')
curve(dlnorm(x, -.1348, .1911), col='green', add=T)
dev.off()
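One further wrinkle worth noting: lines() connects points in the order given, so if x is unsorted the red curve will zig-zag even once the x coordinates are supplied. Sorting first gives a clean overlay; a sketch using the fitted meanlog/sdlog from the thread:

```r
xs <- sort(x)
plot(density(x))
lines(xs, dlnorm(xs, -0.1348, 0.1911), col = 'red')
```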
Re: [R] repeated measures ANOVA - among group differences
If Month is nested within Quadrat I think you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/Month), RM)

If Treatment is also nested within Quadrat, you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/(Treatment*Month)), RM)

On Wed, Apr 1, 2009 at 12:42 AM, Jessica L Hite/hitejl/O/VCU hit...@vcu.edu wrote: I have data on the proportion of clutches experiencing different fates (e.g., 4 different sources of mortality) for 5 months. I need to test 1) if the overall proportion of these different fates is different over the entire study and 2) to see if there are monthly differences within (and among) fate types. Thus, I am pretty sure this is an RM analysis (I measure the same quadrats each month). I am fine running the analysis in R with the code below; however, there is no output for the among-group variation... this is an important component - any ideas on how to solve this problem? I have included code and sample data below. Many thanks in advance for help and suggestions. J

both.aov <- aov(ProportioninTreatment ~ factor(Treatment)*factor(Month) + Error(factor(Quadrat)), RM)

Error: factor(id)
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  3 0.51619 0.17206                 #why only partial output here?
###

Error: Within
                    Df Sum Sq Mean Sq F value   Pr(>F)
factor(Fate1)        3 1.2453  0.4151  3.5899 0.017907 *
time                 1 0.9324  0.9324  8.0637 0.005929 **
factor(Fate1):time   3 0.9978  0.3326  2.8763 0.042272 *
Residuals           69 7.9783  0.1156

Fate1 Proportion in Fate ASIN Month Quadrat 1 0.117647059 0.350105778 1 1 1 0 0 2 1 1 0.1 0.339836909 3 1 1 0 0 4 1 1 0 0 5 1 1 0 0 1 2 1 0 0 2 2 1 0.2 0.463647609 3 2 1 0.25 0.523598776 4 2 1 0.1 0.339836909 5 2 1 0 0 1 3 1 0 0 2 3 1 0 0 3 3 1 0.384615385 0.668964075 4 3 1 0 0 5 3 1 0 0 1 4 1 0 0 2 4 1 0 0 3 4 1 0.16667 0.420534336 4 4 1 0 0 5 4 2 0.352941176 0.636132062 1 1 2 0.2 0.463647609 2 1 2 0.3 0.615479708 3 1 2 1 1.570796327 4 1 2 0 0 5 1 2 0.5 0.785398163 1 2 2 0 0 2 2 2 0.6 0.886077124 3 2 2 0.41667 0.701674124 4 2 2 0.2 0.490882678 5 2 2 0 0 1 3 2 0.2 0.463647609 2 3 2 0 0 3 3 2 0.461538462 0.746898594 4 3 2 0 0 5 3 2 0 0 1 4 2 0 0 2 4 2 0.307692308 0.588002604 3 4 2 0.7 0.955316618 4 4 2 0 0 5 4 3 0 0 1 1 3 0 0 2 1 3 0.4 0.729727656 3 1 3 0 0 4 1 3 1 1.570796327 5 1 3 0.5 0.785398163 1 2 3 0 0 2 2 3 0 0 3 2 3 0.25 0.523598776 4 2 3 0.6 0.841068671 5 2 3 0 0 1 3 3 0 0 2 3 3 0 0 3 3 3 0.153846154 0.403057075 4 3 3 0.7 0.955316618 5 3 3 0 0 1 4 3 0 0 2 4 3 0 0 3 4 3 0 0 4 4 3 0.875 1.209429203 5 4 4 0.294117647 0.573203309 1 1 4 0.2 0.463647609 2 1 4 0 0 3 1 4 0 0 4 1 4 0 0 5 1 4 0 0 1 2 4 0 0 2 2 4 0 0 3 2 4 0.08333 0.292842771 4 2 4 0.1 0.339836909 5 2 4 0 0 1 3 4 0 0 2 3 4 0 0 3 3 4 0 0 4 3 4 0.16667 0.420534336 5 3 4 0 0 1 4 4 0 0 2 4 4 0.461538462 0.746898594 3 4 4 0 0 4 4 4 0.125 0.361367124 5 4
Re: [R] repeated measures ANOVA - among group differences
Hm, it seems I possibly used the technical term "nested" inappropriately in my response. I meant: "If Month is a repeated measure within each Quadrat..." and "If Treatment is also a repeated measure within each Quadrat..."

On Wed, Apr 1, 2009 at 8:21 AM, Mike Lawrence mike.lawre...@dal.ca wrote: If Month is nested within Quadrat I think you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/Month), RM)

If Treatment is also nested within Quadrat, you want:

aov(ProportioninTreatment ~ Treatment*Month + Error(Quadrat/(Treatment*Month)), RM)

On Wed, Apr 1, 2009 at 12:42 AM, Jessica L Hite/hitejl/O/VCU hit...@vcu.edu wrote: I have data on the proportion of clutches experiencing different fates (e.g., 4 different sources of mortality) for 5 months. [snip]
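For readers wanting a self-contained illustration of the Error() syntax suggested above, here is a sketch on simulated data (all names and numbers are invented; with one observation per cell the lowest stratum is saturated, so treat this purely as a syntax demo):

```r
set.seed(1)
RM <- expand.grid(Quadrat = factor(1:5), Month = factor(1:4),
                  Treatment = factor(1:4))
RM$ProportioninTreatment <- runif(nrow(RM))
# Treatment and Month are both repeated measures within Quadrat:
fit <- aov(ProportioninTreatment ~ Treatment * Month +
           Error(Quadrat / (Treatment * Month)), data = RM)
summary(fit)
```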
Re: [R] R Language Bloggers and Web Sites Owners
Hm, this feels reminiscent of the Wellsphere free blog content scam: http://neurocritic.blogspot.com/2009/02/wealthcentral.html
Re: [R] help about: anova and population no normal
Those with more formal statistical backgrounds may provide better advice, but in my own informal training I've come to wonder why parametric stats persist in the face of modern computing power. As I understand it, Fisher developed ANOVA as a low-computation method of approximating the Randomization Test (a.k.a. the exhaustive permutation test). Computing power has grown exponentially since Fisher's time, so these days it is feasible to compute the full R-Test in many cases, and in those cases where the full R-Test is not feasible, non-exhaustive permutation variants usually suffice. Indeed, it has been shown (http://brm.psychonomic-journals.org/content/37/3/426.short) that the R-Test is more powerful than the F-Test in the face of skewed distributions. My advice would thus be to abandon parametrics and simply code a randomization test variant of the ANOVA you want. Mike

On Tue, Mar 31, 2009 at 10:29 AM, Sarti Maurizio sart...@irea.cnr.it wrote: Dear R-Helpers, Parametric statistics are statistics where the population is assumed to fit some parametrized distribution (most typically the normal distribution). My problem is: 1) my population is not normal; 2) the sample data of all replications and treatments are well fitted by the Weibull distribution (shape and scale parameters). Can the shape and scale parameters be compared between treatments using the canonical analysis of variance (ANOVA)? Many thanks for your help with these questions.
Re: [R] help about: anova and population no normal
Oh, and to answer your question more directly: the randomization test permits testing hypotheses using any metric, so scale and shape are definitely testable. Typically one is interested in means, so on each iteration of the test loop one computes the group/condition means. However, it's simple to instead compute, say, the best-fit Weibull for each condition and simultaneously test shift, shape, and scale.

On Tue, Mar 31, 2009 at 12:18 PM, Mike Lawrence mike.lawre...@dal.ca wrote: Those with more formal statistical backgrounds may provide better advice, but in my own informal training I've come to wonder why parametric stats persist in the face of modern computing power. [snip]
Maurizio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tinyurl.com/mikes-public-calendar ~ Certainty is folly... I think. ~
Re: [R] how to input multiple .txt files
oops, didn't read the question fully. If you want to create 2 master files:

cust1_files = list.files(path=path_to_my_files, pattern='cust1', full.names=TRUE)
a = NULL
for(this_file in cust1_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, 'cust1.master.txt')

cust2_files = list.files(path=path_to_my_files, pattern='cust2', full.names=TRUE)
a = NULL
for(this_file in cust2_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, 'cust2.master.txt')

On Mon, Mar 30, 2009 at 8:55 AM, Mike Lawrence mike.lawre...@dal.ca wrote:

my_files = list.files(path=path_to_my_files, pattern='.txt', full.names=TRUE)
a = NULL
for(this_file in my_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, my_new_file_name)

On Sun, Mar 29, 2009 at 10:37 PM, Qianfeng Li qflic...@yahoo.com wrote: How to input multiple .txt files? A data folder has lots of .txt files from different customers. I want to read all these .txt files into different master files, such as: cust1.xx.txt, cust1.xxx.txt, cust1..txt, .. to master file X.txt; cust2.xx.txt, cust2.xxx.txt, cust2..txt, .. to master file Y.txt. Thanks!
Re: [R] how to input multiple .txt files
my_files = list.files(path=path_to_my_files, pattern='.txt', full.names=TRUE)
a = NULL
for(this_file in my_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, my_new_file_name)

On Sun, Mar 29, 2009 at 10:37 PM, Qianfeng Li qflic...@yahoo.com wrote: How to input multiple .txt files? A data folder has lots of .txt files from different customers. I want to read all these .txt files into different master files, such as: cust1.xx.txt, cust1.xxx.txt, cust1..txt, .. to master file X.txt; cust2.xx.txt, cust2.xxx.txt, cust2..txt, .. to master file Y.txt. Thanks!
Re: [R] how to input multiple .txt files
To repent for my sins, I'll also suggest that Hadley Wickham's plyr package (http://had.co.nz/plyr/) is also useful/parsimonious in this context:

a <- ldply(cust1_files, read.table)

On Mon, Mar 30, 2009 at 9:32 AM, baptiste auguie ba...@exeter.ac.uk wrote: May I suggest the following:

a <- do.call(rbind, lapply(cust1_files, read.table))

(I believe expanding objects in a for loop belongs in the R Inferno.) baptiste

On 30 Mar 2009, at 12:58, Mike Lawrence wrote:

cust1_files = list.files(path=path_to_my_files, pattern='cust1', full.names=TRUE)
a = NULL
for(this_file in cust1_files){
	a = rbind(a, read.table(this_file))
}
write.table(a, 'cust1.master.txt')

_ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
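Putting the suggestions in this thread together, a sketch that builds one master file per customer prefix without growing an object inside a loop (`path_to_my_files` is assumed to be defined, as in the earlier replies):

```r
for (cust in c("cust1", "cust2")) {
  # all files for this customer
  files <- list.files(path = path_to_my_files, pattern = cust, full.names = TRUE)
  # read and row-bind them in one step
  combined <- do.call(rbind, lapply(files, read.table))
  write.table(combined, paste(cust, ".master.txt", sep = ""))
}
```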
Re: [R] loading and manipulating 10 data frames-simplified
Bin10_cumper = cumsum(Bin10_Acres/Bin10Acres_sum)
#Calculates the probability of choosing a particular parcel from the bin
Bin10_parprob = abs(1 - Bin10_cumper)
#Combines parcel acreage data and cumulative percentage data
Bin10Main.data = cbind(Bin10_Acres, Bin10_parprob, Bin10_TAZ, Bin10_TAZvacant)
#--

I have tried:

Bin <- lapply(1:10, function(.file){
	#Loads bin data frame from csv files with acres and TAZ data
	fileName <- paste("I:/Research/Samba/urb_transport_modeling/LUSDR/Workspace/BizLandPrice/data/Bin_lookup_values/Bin", .file, "_lookup.csv", sep="")
	Bin_main <- read.csv(file=fileName, head=FALSE)
	#Separates acres data from main data and converts acres to square feet
	Bin_Acres = Bin_main[[1]] * 43560
	#Separates TAZ data from main data
	Bin_TAZ = Bin_main[[2]]
	#Separates vacant TAZ data from main data and converts acres to square feet
	Bin_TAZvacant = Bin_main[[3]] * 43560
	#Sums the parcel acreage data of the bin
	BinAcres_sum = sum(Bin_Acres)
	#Creates a vector of cumulative percentages for each parcel of the bin
	Bin_cumper = cumsum(Bin_Acres/BinAcres_sum)
	#Calculates the probability of choosing a particular parcel from the bin
	Bin_parprob = abs(1 - Bin_cumper)
	#Combines parcel acreage data and cumulative percentage data
	BinMain.data = cbind(Bin_Acres, Bin_parprob, Bin_TAZ, Bin_TAZvacant)
})
#-

I am thinking that a simple for loop would work with a paste command, but that approach has yet to work either. Thanks. Cheers, JR

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Way to rotate a histogram?
In case anyone is still interested, a slight improvement is to plot both the density and normal distributions on top of the empirical histogram (the previous version plotted only the density):

library(ggplot2)
test_data <- rnorm(100, mean=10, sd=4)
a = data.frame(obs=test_data, condition='None')
p1 = ggplot(data=a, aes(x=obs)) +
	geom_histogram(aes(y=..density..)) +
	stat_density(mapping=aes(ymax=max(..density..)), geom='path', colour='red') +
	stat_function(fun=dnorm, args=list(m=mean(a$obs), sd=sd(a$obs)), colour='green') +
	scale_x_continuous(limits=range(a$obs)) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect()) +
	coord_flip()
p2 = ggplot(data=a, aes(x=condition, y=obs)) +
	geom_boxplot() +
	scale_y_continuous(limits=range(a$obs)) +
	scale_x_discrete(name='', labels='') +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
p3 = ggplot(data=a, aes(sample=(obs-mean(obs))/sd(obs))) +
	stat_qq(distribution=qnorm) +
	geom_abline(intercept=0, slope=1) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
print(p1, vp=viewport(width=1/3, height=1, x=1/3*.5, y=.5))
print(p2, vp=viewport(width=1/3, height=1, x=1/3+1/3*.5, y=.5))
print(p3, vp=viewport(width=1/3, height=1, x=2/3+1/3*.5, y=.5))
Re: [R] geometric mean of probability density functions
If you're interested in comparing empirical to simulated distributions, I see two alternatives to your density() approach (which will be sensitive to your choice of bandwidth). Both of the following have been used in my field to look at the fit of empirical response-time data to models of human information processing: (1) estimate some number of quantiles for each set of 1000 replicates, average the quantile points, then compare the results to a similar set of quantiles generated from your simulation data; (2) fit an a priori distribution to each set of 1000 replicates, then, within each distribution parameter, average across sets and compare to the parameters obtained from fitting the simulation data.

#generate some fake simulation data
sim.x = rnorm(1e5, 100, 20)

#make up some fake empirical data
a = data.frame(
	set = rep(1:8, each=1e3)
	, x = rnorm(8*1e3, 100, 20) + rexp(8*1e3, 50)
)

#define a function to compute the geometric mean
##(fails with negative numbers!)
geometric.mean = function(x) exp(mean(log(x)))

# Quantile approach
#set up a data frame to collect empirical quantile data
emp.q = expand.grid(set=unique(a$set), q=seq(.1,.9,.1), x=NA)

#estimate empirical quantiles
for(set in unique(a$set)){
	emp.q$x[emp.q$set==set] = quantile(a$x[a$set==set], probs=unique(emp.q$q))
}

#compute the geometric mean for each empirical quantile
emp.q.means = with(emp.q, aggregate(x, list(q=q), geometric.mean))

#estimate the quantiles from the simulation data
sim.q = quantile(sim.x, unique(emp.q$q))

#assess the fit using the sum of squared scaled errors
quantile.sum.sq.sc.err = sum(((emp.q.means$x - sim.q)/sim.q)^2)
print(quantile.sum.sq.sc.err)

# Fitting approach using the Generalized Lambda Distribution
## GLD chosen because it's flexible and easily fit using the gld package
#install the gld package if necessary
if(!('gld' %in% installed.packages())){
	install.packages('gld')
}
#load the gld package
library(gld)

#set up a data frame to collect empirical GLD parameters
emp.gld.par = expand.grid(set=unique(a$set), lambda=1:4, x=NA)

#estimate empirical GLD parameters
## print-out keeps an eye on convergence (hopefully 0)
## takes a while to complete
for(set in unique(a$set)){
	fit = starship(a$x[a$set==set])
	cat('Set:', set, ', convergence:', fit$optim.results$convergence, '\n')
	emp.gld.par$x[emp.gld.par$set==set] = fit$lambda
}

#compute the geometric mean for each empirical GLD parameter
emp.gld.par.means = with(emp.gld.par, aggregate(x, list(lambda=lambda), geometric.mean))

#estimate the GLD parameters from the simulation data
sim.fit = starship(sim.x)
cat('Sim convergence:', sim.fit$optim.results$convergence, '\n')
sim.gld.par = sim.fit$lambda

#assess the fit using the sum of squared scaled errors
gld.par.sum.sq.sc.err = sum(((emp.gld.par.means$x - sim.gld.par)/sim.gld.par)^2)
print(gld.par.sum.sq.sc.err)
Re: [R] Way to rotate a histogram?
Using ggplot2:

library(ggplot2)
test_data <- rnorm(100)
a = data.frame(obs=test_data, condition='None')
p1 = qplot(data=a, x=obs, geom='histogram') + coord_flip()
p2 = qplot(data=a, y=obs, x=condition, geom='boxplot') +
	opts(axis.text.y=theme_blank(), axis.title.y=theme_blank()) +
	coord_flip()
p3 = qplot(sample=test_data, stat='qq', distribution=qnorm) + coord_flip()
print(p1, vp=viewport(width=1, height=1/3, x=.5, y=1/3*.5))
print(p2, vp=viewport(width=1, height=1/3, x=.5, y=1/3+1/3*.5))
print(p3, vp=viewport(width=1, height=1/3, x=.5, y=2/3+1/3*.5))

On Tue, Mar 17, 2009 at 12:38 PM, Jason Rupert jasonkrup...@yahoo.com wrote: Here is what I have so far:

test_data <- rnorm(100)
par(mfrow=c(1,3)) # 1 row by 3 columns layout of plots
hist(test_data)
boxplot(test_data)
qqnorm(test_data)

I noticed that I can rotate a boxplot via horizontal, but apparently hist does not have that functionality. I tried stacking the plots vertically:

test_data <- rnorm(100)
par(mfrow=c(3,1)) # 3 rows by 1 column layout of plots
hist(test_data)
boxplot(test_data, horizontal=TRUE)
qqnorm(test_data)

However, I would have to rotate the qqnorm plot, which would be pretty confusing and I think non-standard. Thank you again for any feedback and insight regarding trying to reproduce the JMP figure shown at: http://n2.nabble.com/Can-R-produce-this-plot--td2489288.html

--- On Tue, 3/17/09, Jason Rupert jasonkrup...@yahoo.com wrote: From: Jason Rupert jasonkrup...@yahoo.com Subject: Re: [R] R package to automatically produce combination plot? To: R-help@r-project.org Date: Tuesday, March 17, 2009, 9:39 AM I guess no reply means there is not an existing package to produce the plot? I will post the results of my script to hopefully help others who are trying to formulate the same plot. Thanks again.

--- On Mon, 3/16/09, Jason Rupert jasonkrup...@yahoo.com wrote: From: Jason Rupert jasonkrup...@yahoo.com Subject: [R] R package to automatically produce combination plot? To: R-help@r-project.org Date: Monday, March 16, 2009, 8:14 PM By any chance is there an R package that automatically produces the plot shown at the following link: http://n2.nabble.com/Can-R-produce-this-plot--td2489288.html That is, an R package to produce one plot that has the following: (a) a vertically oriented histogram, (b) an associated barplot, and (c) a quantile-quantile plot (Q-Q plot). This is based on a class lecture from the University of Pennsylvania: stat.wharton.upenn.edu/~mcjon/stat-431/lecture-02.pdf I am pretty confident I can put one together, but just wanted to check that there does not already exist an R package to output such a plot. Thanks again.
Re: [R] Way to rotate a histogram?
I actually pasted the wrong code! This attempts to replicate the original request, the JMP graphic:

library(ggplot2)
test_data <- rnorm(100, mean=10, sd=4)
a = data.frame(obs=test_data, condition='None')
p1 = ggplot(data=a, aes(x=obs)) +
	geom_histogram(aes(y=..density..)) +
	geom_density() +
	scale_x_continuous(limits=range(a$obs)) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect()) +
	coord_flip()
p2 = ggplot(data=a, aes(x=condition, y=obs)) +
	geom_boxplot() +
	scale_y_continuous(limits=range(a$obs)) +
	scale_x_discrete(name='', labels='') +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
p3 = ggplot(data=a, aes(sample=(obs-mean(obs))/sd(obs))) +
	stat_qq(distribution=qnorm) +
	geom_abline(intercept=0, slope=1) +
	opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank(), panel.background=theme_rect(), axis.ticks=theme_blank(), axis.text.y=theme_blank(), axis.title.y=theme_blank())
print(p1, vp=viewport(width=1/3, height=1, x=1/3*.5, y=.5))
print(p2, vp=viewport(width=1/3, height=1, x=1/3+1/3*.5, y=.5))
print(p3, vp=viewport(width=1/3, height=1, x=2/3+1/3*.5, y=.5))

On Tue, Mar 17, 2009 at 6:36 PM, David Winsemius dwinsem...@comcast.net wrote: Nice work, Mike. Actually I think he was looking for it done this way:

library(ggplot2)
test_data <- rnorm(100)
a = data.frame(obs=test_data, condition='None')
p1 = qplot(data=a, x=obs, geom='histogram') + coord_flip()
p2 = qplot(data=a, y=obs, x=condition, geom='boxplot') +
	opts(axis.text.y=theme_blank(), axis.title.y=theme_blank())
p3 = qplot(sample=test_data, stat='qq', distribution=qnorm)
print(p1, vp=viewport(width=1/6, height=1, y=.5, x=1/6*.5))
print(p2, vp=viewport(width=1/6, height=1, y=.5, x=1/6+1/6*.5))
print(p3, vp=viewport(width=2/3, height=1, y=.5, x=1/3+2/3*.5))

Perhaps with a red line through the hinge points. And we'd probably need to get the axes aligned properly, but it's very close. -- David Winsemius

On Mar 17, 2009, at 5:14 PM, Mike Lawrence wrote:

library(ggplot2)
test_data <- rnorm(100)
a = data.frame(obs=test_data, condition='None')
p1 = qplot(data=a, x=obs, geom='histogram') + coord_flip()
p2 = qplot(data=a, y=obs, x=condition, geom='boxplot') +
	opts(axis.text.y=theme_blank(), axis.title.y=theme_blank()) +
	coord_flip()
p3 = qplot(sample=test_data, stat='qq', distribution=qnorm) + coord_flip()
print(p1, vp=viewport(width=1, height=1/3, x=.5, y=1/3*.5))
print(p2, vp=viewport(width=1, height=1/3, x=.5, y=1/3+1/3*.5))
print(p3, vp=viewport(width=1, height=1/3, x=.5, y=2/3+1/3*.5))

David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] How to find maximum values on the density function of a random variable
rv <- rbinom(1, 1, 0.1) + rnorm(1)
d.rv = density(rv)
d.x = d.rv$x
d.y = d.rv$y
d.rv.max = d.rv$x[which.max(d.rv$y)]
plot(d.rv)
abline(v=d.rv.max)
#that what you want?

On Thu, Mar 12, 2009 at 6:28 PM, g...@ucalgary.ca wrote: I would like to find the maximum values of the density function of a random variable. For example, I have a random variable rv <- rbinom(1,1,0.1) + rnorm(1). Its density function is given by density(rv) and can be displayed by plot(density(rv)). How do I calculate its maximum values? A density function may have a few (global and local) maxima. Please help. Thanks, -james
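Since the original question asked about multiple (global and local) maxima, a sketch that finds every local peak of the estimated density by locating sign changes in its first differences (using a larger sample, since density() of a single draw isn't very informative):

```r
set.seed(1)
rv <- rbinom(1000, 1, 0.1) + rnorm(1000)
d <- density(rv)

# a local maximum is where the slope changes from positive to negative
is.peak <- c(FALSE, diff(sign(diff(d$y))) == -2, FALSE)
peaks <- d$x[is.peak]

plot(d)
abline(v = peaks, col = "red")
```

which.max() then picks the global maximum out of `peaks` if you need just one value.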
[R] OT: Likelihood ratio for the randomization/permutation test?
Hi guRus, My discipline (experimental psychology) is gradually moving away from null hypothesis testing and towards measures of evidence. One measure of evidence that has been popular of late is the likelihood ratio. Glover & Dixon (2005) demonstrate the calculation of the likelihood ratio from ANOVA tables, but I'm also interested in non-parametric statistics and wonder if anyone has ideas on how to compute a likelihood ratio from a randomization test (a.k.a. permutation test). Say one had two groups and was interested in whether the mean scores of the two groups differ in a manner consistent with random chance, or in a manner consistent with a non-null effect of some manipulation applied to the two groups. The randomization test addresses this by randomly re-assigning the participants to the groups, re-computing the difference between means, and repeating many times, yielding a distribution of simulated difference scores that represents the distribution expected by chance. Within a null-hypothesis-testing framework, you then estimate the probability of the observed difference under the null by observing the proportion of simulated difference scores that are greater in magnitude than the observed difference score. Any guesses on how to translate this into a quantification of evidence? Mike
Re: [R] Centering multi-line strip text in lattice
Perfect, thanks!

On Wed, Mar 11, 2009 at 7:41 PM, Paul Murrell p.murr...@auckland.ac.nz wrote: Hi, Mike Lawrence wrote: I'm having trouble centering multi-line strip text in lattice. As the code below demonstrates, the bounding box of the text is centered within the strip, but the first line isn't centered relative to the longer second line. The adj argument to par.strip.text doesn't seem to do much. Suggestions?

a = data.frame(
	x = rep(1:10, 2)
	, y = rep(1:10, 2)
	, z = rep(c('First Line\nLonger Second Line (1)', 'First Line\nLonger Second Line (2)'), each=10)
)
xyplot(
	y ~ x | z
	, data = a
	, par.strip.text = list(cex = .75, lineheight = 1, lines = 2, adj = .5)
)

Here's one way, by writing your own strip function that calls the default strip function with blanked-out labels and then draws the labels directly with 'grid' calls:

xyplot(
	y ~ x | z
	, data = a
	, par.strip.text = list(lines = 1.5)
	, strip = function(which.panel, factor.levels, ...){
		strip.default(which.panel = which.panel, factor.levels = rep("", length(factor.levels)), ...)
		pushViewport(viewport(clip = "on"))
		grid.text(factor.levels[which.panel], gp = gpar(cex = .75, lineheight = 1))
		popViewport()
	}
)

Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 p...@stat.auckland.ac.nz http://www.stat.auckland.ac.nz/~paul/
Re: [R] find the important inputs to the neural network model in nnet package
One thought is to train the net and obtain a performance measure on a testing corpus. Next, for each input, run the testing corpus again, but zero all values for that input and obtain a new measure of performance. Zeroing an important input will hurt performance more than zeroing an unimportant one.

On Tue, Mar 10, 2009 at 9:41 AM, abbas tavassoli tavassoli...@yahoo.com wrote: Hi, I have a binary variable and many explanatory variables, and I want to use the nnet package to model these data (instead of logistic regression). I want to find the most effective variables (inputs to the network) in the neural network model. How can I do this? Thanks.
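A rough sketch of that zeroing idea using nnet, with made-up data (all variable names here are purely illustrative); "performance" is hold-out classification accuracy, so the assumption is that you have reserved part of your data as a test set:

```r
library(nnet)
set.seed(1)

# made-up data: y depends strongly on x1, weakly on x2, not at all on x3
dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
dat$y <- factor(ifelse(dat$x1 + 0.3 * dat$x2 + rnorm(200, sd = 0.5) > 0, "a", "b"))
train <- dat[1:150, ]
test <- dat[151:200, ]

fit <- nnet(y ~ x1 + x2 + x3, data = train, size = 3, trace = FALSE)
base.acc <- mean(predict(fit, test, type = "class") == test$y)

# drop in accuracy when each input is zeroed in turn;
# a larger drop suggests a more important input
sapply(c("x1", "x2", "x3"), function(v) {
  zeroed <- test
  zeroed[[v]] <- 0
  base.acc - mean(predict(fit, zeroed, type = "class") == test$y)
})
```

Because single nnet fits depend on random starting weights, averaging this importance measure over several refits would give a more stable ranking.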
Re: [R] simple question beginner
names() is a great function for finding out how to get info from objects of unknown structure, so try: names(out)

2009/3/10 Λεωνίδας Μπαντής bleonida...@yahoo.gr: Hi there, I am a beginner in R and I have a basic question. Suppose I run a common procedure such as a t-test or a Cox model like below:

out <- coxph(Surv(tstart, tstop, death1) ~ x1 + x1:log(tstop+1), test1, method=c("breslow"))

which yields the following result:

Call: coxph(formula = Surv(tstart, tstop, death1) ~ x1 + x1:log(tstop + 1), data = test1, method = c("breslow"))

                    coef  exp(coef)  se(coef)     z     p
x1                 -9.58   6.89e-05      6.83  -1.40  0.16
x1:log(tstop + 1)   6.90   9.93e+02      5.63   1.23  0.22

Likelihood ratio test=2.97 on 2 df, p=0.226 n= 120

Now I simply want to create an array (say a) with the coefficients, i.e. I want a <- c(-9.58, 6.90). Generally, how can I take the elements I want from the output above for further manipulation? Thanks in advance for any answer!!
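To make the extraction concrete with a reproducible model, an lm() fit on the built-in cars data stands in for the coxph() fit here; coef() works the same way on both kinds of fitted object:

```r
fit <- lm(dist ~ speed, data = cars)
names(fit)        # lists the components of the fitted object
a <- coef(fit)    # named numeric vector of coefficients
a[["speed"]]      # extract a single coefficient by name
```

For the coxph() example above, coef(out) would return the vector containing -9.58 and 6.90.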
Re: [R] Changing factor to numeric
Try: as.numeric(as.character(x))

I usually define the following for this purpose:

factor.to.number = function(x){
	as.numeric(as.character(x))
}

On Tue, Mar 10, 2009 at 2:25 AM, ojal john owino ojal.johnow...@googlemail.com wrote: Dear Users, I have a variable in my dataset which is of type factor, but it actually contains numeric entries like 5.735 and 4.759. This is because the data was read from a CSV file into R and this variable contained other characters which were not numeric. I have now dropped the records with the non-numeric characters for this variable and want to change it to numeric storage type. I have tried using the as.numeric() function, but it changes the values of the variable to what I think are the ranks of the individual values in the dataset. For example, if 5.735 is the current content of the field, then the new object created by as.numeric() will contain a value like 680 if 5.735 was the highest value for the variable and the dataset had 680 records. How can I change the storage type without changing the contents of this variable? Thanks for your consideration. -- Ojal John Owino P.O Box 230-80108 Kilifi, Kenya. Mobile: +254 728 095 710
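A tiny demonstration of why the as.character() step is needed: as.numeric() on a factor returns the internal level codes, not the printed values:

```r
x <- factor(c("5.735", "4.759"))
as.numeric(x)                  # internal level codes 2 1, not the data
as.numeric(as.character(x))   # 5.735 4.759, the actual values
as.numeric(levels(x))[x]      # equivalent, converting each level only once
```

The last form is the one recommended in ?factor, since it converts each distinct level once rather than every element.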
[R] Centering multi-line strip text in lattice
I'm having trouble centering multi-line strip text in lattice. As the code below demonstrates, the bounding box of the text is centered within the strip, but the first line isn't centered relative to the longer second line. The adj argument to par.strip.text doesn't seem to do much. Suggestions?

a = data.frame(
	x = rep(1:10, 2)
	, y = rep(1:10, 2)
	, z = rep(c('First Line\nLonger Second Line (1)', 'First Line\nLonger Second Line (2)'), each=10)
)
xyplot(
	y ~ x | z
	, data = a
	, par.strip.text = list(cex = .75, lineheight = 1, lines = 2, adj = .5)
)
Re: [R] how to estimate distribution?
I know it only causes a shift-and-scale transformation of the distribution, but it always gets my goat when I see people using the sum as a statistic of interest when the mean has a more general application (i.e. another researcher might obtain a different number of observations per subject) and a more meaningful interpretation (i.e. the mean is the answer you'd expect, on average, from the participant on any given question; the sum is ... ?).

An accepted method of fitting empirical data to theoretical distributions is maximum likelihood estimation. Sometimes there are analytic solutions; sometimes you have to use a search algorithm like optimize() (for 1-parameter distributions) or optim() (for multiple-parameter distributions) to find the solution. Say you hypothesized that the red data came from an exponential distribution: you would define a function that obtains the sum log-likelihood of the data given a specific rate parameter, then search a reasonable range of rate parameters for the one that maximizes the sum log-likelihood:

exp.sll = function(rate, x){
	log.lik = dexp(x, rate, log=TRUE)
	sum.log.lik = sum(log.lik)
	print(c(rate, sum.log.lik))
	return(sum.log.lik)
}

a = rexp(150, rate=10)
hist(a, probability=TRUE)
best.rate = optimize(
	exp.sll
	, x = a
	, lower = 0
	, upper = 1/mean(a)*10
	, maximum = TRUE
)$maximum
curve(dexp(x, rate=best.rate), col='red', add=TRUE)

On Wed, Mar 4, 2009 at 12:01 PM, Simone Gabbriellini simone.gabbriell...@gmail.com wrote: Dear R-Experts, I have an empirical dataset with 150 subjects and 24 observations each. In each observation, a subject can score in the range 0:3. I then made a simple index by summing the values in each row, so each subject has a score between 0 and 72.
I was thinking about what kind of theoretical distribution such an index should have, so I tried to make things random like:

data <- array(0, c(150, 24))
data2 <- array(0, c(150, 100))
for (prova in 1:100){
  for (i in 1:24){
    for (q in 1:150){
      data[q, i] <- sample(0:3, 1)
    }
  }
  for (riga in 1:150){
    data2[riga, prova] <- sum(data[riga, ])
  }
}

now here you can find the plotted theoretical values (black) against the empirical ones (red): http://www.digitaldust.it/papers/indice.png How can I estimate the density of both distributions? because the red one looks like a Pareto distribution, but I am not an expert... many thanks, Simone
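As an aside, the triple loop above can be drawn in a single call: sample the whole 150×24 score matrix at once and sum row-wise. A minimal sketch (variable names are illustrative, not from the original post); by the central limit theorem, the row sums of 24 independent uniform{0..3} scores should look approximately normal:

```r
set.seed(1)
n.subj <- 150
n.obs <- 24

# one draw of the whole score matrix, each cell uniform on 0:3
scores <- matrix(sample(0:3, n.subj * n.obs, replace = TRUE),
                 nrow = n.subj, ncol = n.obs)

# one simulated index per subject, necessarily in 0..72
index <- rowSums(scores)
range(index)
```

This replaces one iteration of the outer `prova` loop; repeating it 100 times (e.g. with replicate()) reproduces the full simulation without the inner loops.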
Re: [R] info
http://lmgtfy.com/?q=Parallel+R

The first entry provides an article that notes the various solutions thus far (Rmpi, snow, pnmath).

On Wed, Mar 4, 2009 at 7:06 AM, Enrico Giorgi giorgi_enr...@hotmail.it wrote: Dear Sir or Madam, I've been using R for one year and I really appreciate it. I would like to know if a version performing parallel computations on multicore computers and computer clusters exists. Thank you very much. Yours sincerely. Enrico Giorgi
Re: [R] Computing sd across an array with missing values
It just so happens that I created a vectorized SD function the other day. See the column and row versions below. If there are any rows/columns with only one non-NA value, it will return NaN.

col_sd = function(x){
  dimx = dim(x)
  x.mean = .Internal(colMeans(x, dimx[1], dimx[2], na.rm = TRUE))
  err = t(t(x) - x.mean)
  err.sq = err * err
  sum.err.sq = .Internal(colSums(err.sq, dimx[1], dimx[2], na.rm = TRUE))
  n = .Internal(colSums(!is.na(x), dimx[1], dimx[2], na.rm = TRUE))
  sqrt(sum.err.sq / (n - 1))
}

row_sd = function(x){
  dimx = dim(x)
  x.mean = .Internal(rowMeans(x, dimx[1], dimx[2], na.rm = TRUE))
  err = x - x.mean
  err.sq = err * err
  sum.err.sq = .Internal(rowSums(err.sq, dimx[1], dimx[2], na.rm = TRUE))
  n = .Internal(rowSums(!is.na(x), dimx[1], dimx[2], na.rm = TRUE))
  sqrt(sum.err.sq / (n - 1))
}

On Wed, Feb 25, 2009 at 5:11 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Dear Matt, You're absolutely right. The reason why my code gives different results is that it calculates the standard deviation for each row/column in all the arrays rather than for every cell. My bad.
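The same vectorization can be written without .Internal() — colMeans() and colSums() accept na.rm directly — which also makes it easy to check against apply(). A sketch (the function name col_sd2 is illustrative):

```r
# vectorized per-column SD, NA-tolerant, using only exported base functions
col_sd2 <- function(x) {
  n      <- colSums(!is.na(x))                   # non-NA count per column
  x.mean <- colMeans(x, na.rm = TRUE)
  err.sq <- sweep(x, 2, x.mean)^2                # squared deviations from column mean
  sqrt(colSums(err.sq, na.rm = TRUE) / (n - 1))  # NaN when a column has n == 1
}

# quick agreement check against the slow-but-trusted apply() route
x <- matrix(c(1:10, NA, 12), nrow = 4)
all.equal(col_sd2(x), apply(x, 2, sd, na.rm = TRUE))
```

This sidesteps relying on the internal signatures of colMeans/colSums, which are not part of the public API and can change between R versions.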
You can easily get the results you posted by running apply(p, 1:2, sd, na.rm = TRUE). Here is another option which gives the same results you posted :-) Unfortunately there is no timing improvement :( (Note that I ran both the sd1 and sd2 functions using pp instead of p)

# -
# System times
# -
pp <- array(c(1:5, rep(NA, times = 3)), dim = c(1000, 1000, 60))
system.time(apply(pp, 1:2, sd1))
# user system elapsed
# 93.16 0.23 94.87
system.time(apply(pp, 1:2, sd2))
# user system elapsed
# 66.51 0.30 67.84

# --
# Functions
# --
sd1 <- function(x) sd(x, na.rm = TRUE)
sd2 <- function(i){
  if(sum(!is.na(i)) == 0) temp.sd <- NA else temp.sd <- sd(i, na.rm = TRUE)
  temp.sd
}

Regards, Jorge

On Wed, Feb 25, 2009 at 3:23 PM, Matt Oliver moli...@udel.edu wrote: Hi Jorge, this does not seem to return the same thing as

p <- array(c(1:5, rep(NA, times = 3)), dim = c(5, 5, 3))
sd_fun <- function(i){
  if(sum(!is.na(i)) == 0){
    temp.sd <- NA
  }else{
    temp.sd <- sd(i, na.rm = TRUE)
  }
  return(temp.sd)
}
apply(p, 1:2, sd_fun)

Am I missing something basic here?

On Wed, Feb 25, 2009 at 11:47 AM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Hi Matt, There is a typo in the first way. My apologies. It should be:

# First
res <- apply(p, 3, function(X) c(scols = apply(X, 2, sd, na.rm = TRUE), srows = apply(X, 1, sd, na.rm = TRUE)))
res

HTH, Jorge

On Wed, Feb 25, 2009 at 11:42 AM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Dear Matt, Here are two ways:

# Data
p <- array(c(1:5, rep(NA, times = 3)), dim = c(5, 5, 3))

# First
res <- apply(p, 3, function(X) c(scols = apply(X, 2, sd, na.rm = TRUE), srows = apply(X, 2, sd, na.rm = TRUE)))
res

# Second
res2 <- apply(p, 3, function(X) list(scols = apply(X, 2, sd, na.rm = TRUE), srows = apply(X, 1, sd, na.rm = TRUE)))
lapply(res2, function(x) do.call(rbind, x))

HTH, Jorge

On Wed, Feb 25, 2009 at 10:53 AM, Matt Oliver moli...@udel.edu wrote: Dear help, suppose I have this array and want to compute sd across rows and columns.
p <- array(c(1:5, rep(NA, times = 3)), dim = c(5, 5, 3))

apply(p, 1:2, sd) fails because sd requires at least 2 numbers to compute sd. apply(p, 1:2, sd, na.rm = TRUE) fails for the same reason. I crafted my own function that does what I want:

sd_fun <- function(i){
  if(sum(!is.na(i)) == 0){
    temp.sd <- NA
  }else{
    temp.sd <- sd(i, na.rm = TRUE)
  }
  return(temp.sd)
}
apply(p, 1:2, sd_fun)

This does what I want, but when I scale up to large arrays like

pp <- array(c(1:5, rep(NA, times = 3)), dim = c(1000, 1000, 60))

the apply function takes a long time to run. Is there a faster, more efficient way to do this? Thanks in advance, Matt

-- Matthew J. Oliver Assistant Professor College of Marine and Earth Studies University of Delaware 700 Pilottown Rd. Lewes, DE, 19958 302-645-4079 http://www.ocean.udel.edu/people/profile.aspx?moliver

-- Mike
Re: [R] colored maps again
Here's a bit of a hacky way to do it:

#get the names of each state
states <- map('state', plot = FALSE)$names
#set up some random state-color data
cols <- data.frame(state = states, x = sample(1:10, length(states), replace = TRUE))
#do the plot
map('usa')
for(this_state in states){
  map('state', region = this_state, add = TRUE, fill = TRUE, col = cols$x[cols$state == this_state])
}

On Thu, Feb 19, 2009 at 12:45 PM, Alina Sheyman alina...@gmail.com wrote: I'm trying to create a colored map that would show the number of students per state. My data frame consists of two columns - state and count. I'm using the following code:

library(maps)
map('usa')
library(plotrix)
state.col <- color.scale(gre$count, 0, 0, c(0, 1))
map('state', fill = TRUE, col = state.col)

I'm getting a map, but the values are not being mapped to the correct states. What do I need to do to fix that?
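The per-state loop can also be avoided: if the vector of colours is reordered to line up with the order in which the regions will be drawn, a single match() does the job. A self-contained sketch of just the lookup logic (toy region names and colours; with the maps package one would take the drawing order from map('state', plot = FALSE)$names):

```r
# toy lookup table: one colour per region name, in arbitrary order
regions <- c("alabama", "alaska", "arizona")
cols <- data.frame(state = regions,
                   x = c("red", "green", "blue"),
                   stringsAsFactors = FALSE)

# the order in which regions will actually be drawn
plot.order <- c("arizona", "alabama", "alaska")

# reorder the colours to match the drawing order
fill <- cols$x[match(plot.order, cols$state)]
fill  # "blue" "red" "green"

# then a single call suffices, e.g.:
# map('state', region = plot.order, fill = TRUE, col = fill)
```

This is also the likely cause of the original problem: color.scale() produced colours in data-frame order, not in map()'s drawing order.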
Re: [R] error bars
Check out ggplot2: http://had.co.nz/ggplot2/ Particularly: http://had.co.nz/ggplot2/geom_errorbar.html If you need more help you'll have to provide a self-contained set of data/code. Hopefully you'll soon be completely rid of this strange excel of which you speak :Op

On Wed, Feb 18, 2009 at 9:38 PM, Nicole Hackman nicole.hack...@gmail.com wrote: Hello, I have a very simple data set I imported from excel, including 96 averages in a column along with 96 standard errors associated with those averages (calculated in excel). I plotted the 96 averages using R and I am wondering if it is possible to plot the second column of standard errors, applying error bars to each value so they represent the error corresponding to each average? thanks, Nicole -- Nicole Hackman Master of Science Student University of Washington College of Forest Resources Box 352100 n...@u.washington.edu
Re: [R] anova help
Provide some example data and code and someone may be able to help.

On Sat, Feb 14, 2009 at 2:46 PM, Joe King j...@joepking.com wrote: Hi all, I am trying to run a two-factor anova, but one of the factors is a random factor. I am also running it in SPSS, and it seems R is dividing by the wrong term to get the appropriate F term. Here is my data. In SPSS the F scores are about double the ones in R. How can I specify one of my factors as a random factor, or change it to where it does the right model fitting? I am using the lm command instead of glm. I am new to R so this might seem basic. Joe King, M.A. mailto:j...@joepking.com j...@joepking.com Never give in, never give in, never; never; never; never - in nothing, great or small, large or petty - never give in except to convictions of honor and good sense - Winston Churchill You have enemies? Good. That means you've stood up for something, sometime in your life. - Winston Churchill
Re: [R] Mixed ANCOVA with between-Ss covariate?
I'm surprised there were no takers on this query; I thought it would be an easy answer, particularly since I provided an example data set and code. Did my request run afoul of the list etiquette?

On Tue, Feb 10, 2009 at 5:12 PM, Mike Lawrence m...@thatmike.com wrote: Hi all, I have data from an experiment with 3 independent variables, 2 within and 1 between. In addition to the dependent variable, I have a covariate that is a single measure per subject. Below I provide an example generated data set and my approach to implementing the ANCOVA. However, the output confuses me; why does the covariate only appear in the first stratum? Presumably it should appear in every stratum in which there can be between-Ss effects or interactions with between-Ss effects, no?

#generate data
set.seed(1)
a = rbind(
  expand.grid(
    id = 1:20
    , iv1 = 1:2
    , iv2 = 1:2
  )
)
a$group = factor(a$id < 11)
a$dv = rnorm(length(a[, 1]))
a$covariate = NA
for(i in unique(a$id)){
  a$covariate[a$id == i] = rnorm(1)
}

#make sure id, iv1, and iv2 are factorized
a$id = factor(a$id)
a$iv1 = factor(a$iv1)
a$iv2 = factor(a$iv2)

#run ANCOVA
covariate_aov = aov(dv ~ covariate + group*iv1*iv2 + Error(id/(iv1*iv2)), data = a)
summary(covariate_aov)
[R] Mixed ANCOVA with between-Ss covariate?
Hi all, I have data from an experiment with 3 independent variables, 2 within and 1 between. In addition to the dependent variable, I have a covariate that is a single measure per subject. Below I provide an example generated data set and my approach to implementing the ANCOVA. However, the output confuses me; why does the covariate only appear in the first stratum? Presumably it should appear in every stratum in which there can be between-Ss effects or interactions with between-Ss effects, no?

#generate data
set.seed(1)
a = rbind(
  expand.grid(
    id = 1:20
    , iv1 = 1:2
    , iv2 = 1:2
  )
)
a$group = factor(a$id < 11)
a$dv = rnorm(length(a[, 1]))
a$covariate = NA
for(i in unique(a$id)){
  a$covariate[a$id == i] = rnorm(1)
}

#make sure id, iv1, and iv2 are factorized
a$id = factor(a$id)
a$iv1 = factor(a$iv1)
a$iv2 = factor(a$iv2)

#run ANCOVA
covariate_aov = aov(dv ~ covariate + group*iv1*iv2 + Error(id/(iv1*iv2)), data = a)
summary(covariate_aov)
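One way to see why the covariate can only show up in the first (between-Ss) stratum: it is a single measure per subject, so it is constant within each id and has no within-Ss variance for the other error strata to explain. A quick check (an illustrative recreation of the design, not the original code):

```r
set.seed(1)
# same design shape as the post: 20 subjects crossed with two 2-level IVs
a <- expand.grid(id = 1:20, iv1 = 1:2, iv2 = 1:2)

# one covariate value per subject, recycled across that subject's rows
a$covariate <- rnorm(20)[a$id]

# the covariate has zero variance within each subject...
within.var <- tapply(a$covariate, a$id, var)
all(within.var == 0)  # TRUE

# ...so only the between-subject (id) stratum can test it
```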
[R] lme() direction
Hi guRus, I'm looking for advice on a good way to approach analysis of some multi-level data I've obtained. I had humans classify words and measured response time. Words were 10 positive words (happy, joy, etc.) and 10 negative words (sad, grumpy, etc.). Words were also presented in either white, red, or green color. All variables were manipulated within-Ss, and for each word-color combination I collected 10 observations. So the data would be something like:

set.seed(1)
a = rbind(
  cbind(
    type = 'positive'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('happy', 'joy')
      , repetition = 1:10
    )
  )
  , cbind(
    type = 'negative'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('sad', 'grumpy')
      , repetition = 1:10
    )
  )
)

#add some fake rt data
a$rt = rnorm(length(a[, 1]))

#And because people make errors sometimes:
a$error = rbinom(length(a[, 1]), 1, .1)

#remove error trials because they're not psychologically interesting:
a = a[a$error == 0, ]

I'm most interested in the interaction between color and type, but I know that there is likely an effect of word. Yet since word is not completely crossed with type, simply adding it to an aov() won't work. A colleague recommended I look into lme() but so far I can't figure out the proper call. Another issue is whether to collapse across repetition before running the stats, particularly since errors will leave unequal numbers of observations per cell if repetition is left in. Any advice anyone can provide would be great! Mike
Re: [R] lme() direction
Would it improve things if type were a continuous variable rather than categorical? I chose words at the extreme ends of a valence rating scale, but I still have the raw valence ratings for each word.

On Sat, Feb 7, 2009 at 12:02 PM, Dieter Menne dieter.me...@menne-biomed.de wrote: Mike Lawrence mike at thatmike.com writes: Thanks for the excellent reproducible sample set! I'm most interested in the interaction between color and type, but I know that there is likely an effect of word. Yet since word is not completely crossed with type, simply adding it to an aov() won't work. A colleague recommended I look into lme() but so far I can't figure out the proper call. Without word, it would be

summary(lme(rt ~ type*color, data = a, random = ~1|id))

With the interaction, the extreme would be

summary(lme(rt ~ type*color*word, data = a, random = ~1|id))

or, less extreme,

summary(lme(rt ~ type*color + color:word, data = a, random = ~1|id))

but all these fail because of the rather degenerate structure of your data set. While lmer in package lme4 allows for a wider set of solutions, I currently do not see how it could help, but I might be wrong with p = 0.5.

                         word
type     color    happy  joy  sad  grumpy
positive white      93    90    0     0
         red        90    88    0     0
         green      88    87    0     0
negative white       0     0   88    95
         red         0     0   91    85
         green       0     0   88    88

Another issue is whether to collapse across repetition before running the stats, particularly since errors will leave unequal numbers of observations per cell if it's left in. That's one of the points you have little to bother about with the lme approach. Collapsing would give equal weights to unequal numbers of repeats, though that may be of minor importance when not too extreme.
Dieter

set.seed(1)
a = rbind(
  cbind(
    type = 'positive'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('happy', 'joy')
      , repetition = 1:10
    )
  )
  , cbind(
    type = 'negative'
    , expand.grid(
      id = 1:10
      , color = c('white', 'red', 'green')
      , word = c('sad', 'grumpy')
      , repetition = 1:10
    )
  )
)

#add some fake rt data
a$rt = rnorm(length(a[, 1]))

#And because people make errors sometimes:
a$error = rbinom(length(a[, 1]), 1, .1)

#remove error trials because they're not psychologically interesting:
a = a[a$error == 0, ]

library(nlme)
ftable(a[, c(1, 3, 4)])
summary(lme(rt ~ type*color, data = a, random = ~1|id))
Re: [R] lme() direction
And if I decided to ignore the type variable altogether and simply use the continuous valence variable, this is what I'd use?

summary(lme(
  fixed = rt ~ valence*color
  , data = a
  , random = ~1|id
))

I also have continuous luminance measurements of each color that vary from participant to participant (we used different monitors). If I were interested in luminance *instead* of color, would the following be appropriate, or do I need to do something special to account for the fact that each participant has different values of luminance?

summary(lme(
  fixed = rt ~ valence*luminance
  , data = a
  , random = ~1|id
))

On Sat, Feb 7, 2009 at 12:34 PM, Dieter Menne dieter.me...@menne-biomed.de wrote: Mike Lawrence mike at thatmike.com writes: Would it improve things if type were a continuous variable rather than categorical? I chose words at the extreme ends of a valence rating scale but I still have the raw valence ratings for each word. With the interaction, the extreme would be summary(lme(rt~type*color*word, data=a,random=~1|id)) or, less extreme summary(lme(rt~type*color+color:word, data=a,random=~1|id)) .. Something like

summary(lme(rt ~ type*color + color:as.numeric(word), data = a, random = ~1|id))

(please replace as.numeric() by the raw valence; the example above is simply wrong) could gain you a few degrees of freedom if you are willing to accept the linear hypothesis. And as there is something like raw valence, one should not throw away details about a-priori ordering in favor of a categorical hypothesis. Dieter
Re: [R] Problems in Recommending R
One of my colleagues is an interdisciplinary PhD in Design and Psychology, and he has an in with a design school where we might be able to get students to take on the redesign of the website. He asks: In order to ensure efficient consumption of resources and maximize our return on investment, please provide potential designers with a direct point of contact (name, email, telephone number) so that they may request a project description and feedback. Obviously the redesign idea was generated in a community thread, but if anyone from the R Foundation can step up as such a contact person, I will forward your info to my colleague, who will then take the temperature of students at the design school.
Re: [R] From z to Rho
Do these functions help?

#Fisher's r-to-z:
fr2z <- function(r) atanh(r)
#Fisher's z-to-r:
fz2r <- function(z) tanh(z)

On Fri, Jan 30, 2009 at 9:29 AM, LE PAPE Gilles lepape.gil...@neuf.fr wrote: Hi, when performing a spearman_test stratified by a given factor in package coin, how is it possible to obtain the value of Rho, the Spearman correlation coefficient? Thanks in advance, Gilles (F)
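For reference, atanh() is exactly the textbook Fisher transform z = ½·ln((1+r)/(1−r)), and the two functions round-trip. A quick check:

```r
fr2z <- function(r) atanh(r)  # Fisher's r-to-z
fz2r <- function(z) tanh(z)   # Fisher's z-to-r

r <- 0.42
z <- fr2z(r)

# atanh() equals the classic half-log formula
all.equal(z, 0.5 * log((1 + r) / (1 - r)))

# and the transform inverts cleanly
all.equal(fz2r(z), r)
```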
Re: [R] help with time series
To separate the columns, use the sep argument in read.table():

mystage <- read.table("C:\\Documents and Settings\\skfriedman\\Desktop\\R-scripts\\stage.txt", header = TRUE, sep = ",")

On Fri, Jan 30, 2009 at 4:17 PM, steve_fried...@nps.gov wrote: Hello everyone, I'm working with R 2.8.1 on a Windows machine. I have a question regarding time series analysis. The first question is: how does R expect the input file to be structured? I'm working with a *.txt file similar to the abbreviated one here:

Date,stage
4/2/1953,7.56
4/3/1953,7.56
4/4/1953,7.54
4/5/1953,7.53
4/6/1953,7.5
4/7/1953,7.47
4/8/1953,7.44
4/9/1953,7.41
4/10/1953,7.37
4/11/1953,7.33
4/12/1953,7.3
4/13/1953,7.26
4/14/1953,7.28
4/15/1953,7.28
4/16/1953,7.23
4/17/1953,7.47
4/18/1953,7.59
4/19/1953,7.58
4/20/1953,7.57
4/21/1953,7.56
4/22/1953,7.55
4/23/1953,7.53
4/24/1953,7.51
4/25/1953,7.48
4/26/1953,7.46
4/27/1953,7.5
4/28/1953,7.56

The data record is substantially longer - 50 years' worth of daily hydrologic water stage data (column 2). R seems to get confused by the format of this input, not knowing what to do with the date field, and also deciding to treat everything as a level. I'm reading the data as follows:

mystage <- read.table("C:\\Documents and Settings\\skfriedman\\Desktop\\R-scripts\\stage.txt", header = TRUE)

Looking at the data I get the following:

mystage[1:4,]
[1] 4/2/1953,7.56 4/3/1953,7.56 4/4/1953,7.54 4/5/1953,7.53
20195 Levels: 1/1/1954,8.72 1/1/1955,8.48 1/1/1956,7.94 1/1/1957,7.88 1/1/1958,8.5 ... 9/9/2007,8.84

What I'd like is a time series with a starting date of April 21, 1953, ending December 30, 2008. Data are daily records, so the frequency should be 365 (?), not counting leap year nuisances. So the first question is how should I build the input file to correctly import it to a time series with an odd beginning date?
The analysis I'm really trying to get to will involve calculating the mean monthly stage, the mean seasonal (aggregated over several months) stage, and the annual maximum period with a continuous stage greater than 0. Thanks in advance; I will summarize solutions. Much appreciated, Steve

Steve Friedman Ph.D. Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 steve_fried...@nps.gov Office (305) 224-4282 Fax (305) 224-4147
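Beyond sep = ",", the Date column needs to be parsed explicitly or it will be read as a factor (the "20195 Levels" symptom above). A self-contained sketch using textConnection() in place of the file (the three data rows are just the start of the posted sample):

```r
txt <- "Date,stage
4/2/1953,7.56
4/3/1953,7.56
4/4/1953,7.54"

# read with an explicit separator, then convert the date strings
mystage <- read.table(textConnection(txt), header = TRUE, sep = ",",
                      stringsAsFactors = FALSE)
mystage$Date <- as.Date(mystage$Date, format = "%m/%d/%Y")

class(mystage$Date)  # "Date"

# monthly means then follow directly, e.g.:
# aggregate(stage ~ format(Date, "%Y-%m"), data = mystage, FUN = mean)
```

With Date-class dates there is no need to force the data into a fixed-frequency ts(); daily series with an odd start date are easier to handle as a (date, value) data frame or with a package like zoo.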
Re: [R] Help with normal distribution in random samples...
?rnorm

On Wed, Jan 28, 2009 at 4:04 PM, Sea Captain 1779 dbo...@measinc.com wrote: Hi!!! First time 'R' user looking for a little assistance. Here is what I have so far:

practice1 = matrix((runif(5000, min = 0, max = 12)), 100)

which is creating 50 samples, for 100 cases, distributed between 0-12. What I would like is to be able to set the mean and SD so that the data is normally distributed around, let's say, 7. Any help I can get with achieving that goal would be greatly appreciated!!! -Dan -- View this message in context: http://www.nabble.com/Help-with-normal-distribution-in-random-samples...-tp21713636p21713636.html Sent from the R help mailing list archive at Nabble.com.
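Concretely, swapping runif() for rnorm() with the desired mean and SD (the SD value here is just illustrative, since the original question only fixes the mean):

```r
set.seed(1)
# 50 samples of 100 cases each, normally distributed around 7 with SD 1.5
practice1 <- matrix(rnorm(5000, mean = 7, sd = 1.5), nrow = 100)

dim(practice1)   # 100 x 50
mean(practice1)  # close to 7
```

Note that unlike runif(), rnorm() is unbounded, so a few values will fall outside 0-12; if hard bounds matter, a truncated normal would be needed instead.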
Re: [R] Mode (statistics) in R?
Here's a rather convoluted way of finding the mode (or, at least, the first mode):

x = round(rnorm(100, sd = 5))
my_mode = as.numeric(names(table(x))[which.max(table(x))])

On Mon, Jan 26, 2009 at 9:28 AM, Jason Rupert jasonkrup...@yahoo.com wrote: Hopefully this is a pretty simple question: Is there a function in R that calculates the mode of a sample? That is, I would like to be able to determine the value that occurs most frequently in a data set. I tried the default R mode() function, but it appears to report a storage type or something else. I tried RSeek and some R documentation that I downloaded, but nothing seems to mention calculating the mode. Thanks again.
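Wrapped as a reusable function (the name stat_mode is illustrative, to avoid clashing with base R's mode(), which reports storage mode); note that which.max() keeps the first maximum, so ties resolve to the smallest tied value:

```r
# first mode of a numeric vector, via a frequency table
stat_mode <- function(x) {
  tab <- table(x)
  as.numeric(names(tab)[which.max(tab)])  # which.max keeps the first on ties
}

stat_mode(c(1, 2, 2, 3, 3, 3))  # 3
```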
Re: [R] R for Computational Neuroscience?
Thanks to Ken and Bernardo for their attempts to answer my question, but I was apparently unclear as to what I meant by computational neuroscience. The tools Ken and Bernardo suggest provide means to analyze data from neuroscience research, but I'm actually looking for means to simulate biologically realistic neural systems. Maybe it would help if I provided some keywords, so here is a list of chapters/sections from the MATLAB section of the book How the brain computes: Network fundamentals of computational neuroscience by Thomas Trappenberg:

12 A MATLAB guide to computational neuroscience
12.1 Introduction to the MATLAB programming environment
12.2 Spiking neurons and numerical integration in MATLAB
12.2.1 Integrating Hodgkin-Huxley equations with the Euler method
12.2.2 The Wilson model and advanced integration
12.2.3 MATLAB function files
12.2.4 Leaky integrate-and-fire neuron
12.2.5 Poisson spike trains
12.2.6 Netlet formulas by Anninos et al.
12.3 Associators and Hebbian learning
12.3.1 Hebbian weight matrix in rate models
12.3.2 Hebbian learning with weight decay
12.4 Recurrent networks and network dynamics
12.4.1 Example of a complete network simulation
12.4.2 Quasi-continuous attractor network
12.4.3 Networks with random asymmetric weight matrix
12.4.4 The Lorenz attractor
12.5 Continuous attractor neural networks
12.5.1 Path-integration
12.6 Error-backpropagation network