Re: Regression with repeated measures

2001-03-01 Thread Donald Burrill

On Wed, 28 Feb 2001, Mike Granaas wrote in part (and 2 paragraphs of 
descriptive prose quoted at the end):

 ...  is there some method that will allow him to get the prediction
 equation he wants?

Probably the best approach is the multilevel (aka hierarchical) modelling 
advocated by previous respondents.  Possible problems with that approach: 
(1) you'll need purpose-built software, which may not be conveniently 
available at USD;  (2) the user is usually required (as I rather vaguely 
recall from a brush with Goldstein's ML3 a decade ago) to specify which 
(co)variances are to be estimated in the model, both within and between 
levels, and if your student isn't up to this degree of technical skill, 
(s)he may not have a clue as to what the output will be trying to say. 

For a conceptually simpler, if less rigorous, approach, the problem could 
be addressed as an analysis of covariance (to use the now old-fashioned 
language), using the intended predictor as the covariate and the 10 (or 
whatever number of) trials for each S as a blocking variable (as in 
randomized blocks in ANOVA).  This would at least bleed off (so to write) 
some of the excess degrees of freedom, especially if one also modelled 
the interaction between predictor and blocking variable (which might 
well require a GLM program, rather than an ANCOVA program), as in testing 
homogeneity of regression.  The blocking variable itself might be 
interpretable (if one were interested) as an (idiosyncratic?) amalgam of 
practice/learning and fatigue.
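In, say, Python with statsmodels, the setup might look like the sketch below. The column names and the simulated data are invented purely for illustration; substitute the student's actual variables.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
n_subj, n_trial = 10, 10

# Invented data: 10 subjects x 10 trials; 'isi' predicts 'iri'.
df = pd.DataFrame({
    "trial": np.tile(np.arange(n_trial), n_subj),
    "isi": rng.normal(500, 50, n_subj * n_trial),
})
df["iri"] = 0.8 * df["isi"] + rng.normal(0, 30, len(df))

# ANCOVA: covariate 'isi' plus 'trial' as a categorical block.
ancova = smf.ols("iri ~ isi + C(trial)", data=df).fit()

# Adding the isi x trial interaction tests homogeneity of regression.
full = smf.ols("iri ~ isi * C(trial)", data=df).fit()
print(anova_lm(ancova, full))   # F-test of the interaction terms
print(ancova.params["isi"])     # common slope estimate
```

A non-significant interaction F would support pooling to a single slope, which is the homogeneity-of-regression logic above.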
-- Don.
 --
 Donald F. Burrill                                   [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,              [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                         (603) 535-2597
 Department of Mathematics, Boston University        [EMAIL PROTECTED]
 111 Cummington Street, room 261, Boston, MA 02215   (617) 353-5288
 184 Nashua Road, Bedford, NH 03110                  (603) 471-7128
 --

The situation as Mike described it:

 I have a student coming in later to talk about a regression problem.
 Based on what he's told me so far he is going to be using predicting
 inter-response intervals to predict inter-stimulus intervals (or vice
 versa).
 
 What bothers me is that he will be collecting data from multiple trials
 for each subject and then treating the trials as independent replicates.
 That is, assuming 10 trials/S and 10 S he will act as if he has 100
 independent data points for calculating a bivariate regression.



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: probability definition

2001-03-01 Thread Alex Yu


For a quick walk-through of various probability theories, you may consult "The 
Cambridge Dictionary of Philosophy," pp. 649-651. 

Basically, propensity theory deals with the problem that frequentist
probability cannot be applied to a single case. Propensity theory defines
probability as the disposition of a given kind of physical situation to
yield an outcome of a given type. 

The following is extracted from one of my papers. It briefly discusses 
the history of the classical theory, Reichenbach's frequentism, and the 
Fisherian school:


Fisherian hypothesis testing is based upon relative frequency in the long 
run. Since a version of the frequentist view of probability was developed 
by the positivists Reichenbach (1938) and von Mises (1964), the two schools 
of thought seem to share a common thread. However, this is not necessarily 
true. Both the Fisherian and the positivist frequency theories were proposed 
in opposition to the classical Laplacean theory of probability. In the 
Laplacean perspective, probability is deductive, theoretical, and 
subjective. To be specific, this probability is subjectively deduced from 
theoretical principles and assumptions in the absence of objective 
verification with empirical data. Assuming that every member of a set has 
an equal probability of occurring (the principle of indifference), 
probability is treated as the ratio between the desired event and all 
possible events. This probability, derived from the fairness assumption, 
is assigned before any events occur. 

Positivists such as Reichenbach and von Mises maintained that a very 
large number of empirical outcomes should be observed to form a reference 
class. Probability is the ratio between the frequency of the desired 
outcome and the reference class. Indeed, the empirical probability hardly 
ever concurs exactly with the theoretical probability. For example, when 
a die is thrown, in theory the probability of the occurrence of the number 
"one" should be 1/6. But even in a million simulations, the actual 
proportion of occurrences of "one" is not exactly one in six. It appears 
that the positivists' frequency theory is more valid than the classical 
one. However, the usefulness of this actual, finite, relative frequency 
theory is limited, for it is difficult to say how large the reference 
class must be to count as large enough. 
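A quick simulation makes the point; any seed will do:

```python
import random

random.seed(1)
n = 1_000_000
# Count how often a fair six-sided die shows "one" in a million throws.
ones = sum(1 for _ in range(n) if random.randint(1, 6) == 1)

# The observed proportion hovers near 1/6 but never lands on it exactly.
print(ones / n, "vs", 1 / 6)
```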

Fisher (1930) criticized Laplace's theory as subjective and incompatible 
with the inductive nature of science. However, unlike the positivists' 
empirically based theory, Fisher's is a hypothetical infinite relative 
frequency theory. In the Fisherian school, various theoretical sampling 
distributions are constructed as references against which to compare the 
observed data. Since Fisher did not mention Reichenbach or von Mises, it 
is reasonable to believe that Fisher developed his frequency theory 
independently. Backed by thorough historical research, Hacking (1990) 
asserted that "to identify frequency theories with the rise of positivism 
(and thereby badmouth frequencies, since "positivism" has become 
distasteful) is to forget why frequentism arose when it did, namely when 
there are a lot of known frequencies" (p. 452). In a similar vein, Jones 
(1999) maintained that "while a positivist may have to be a frequentist, 
a frequentist does not have to be a positivist."


Chong-ho (Alex) Yu, Ph.D., MCSE, CNE
Academic Research Professional/Manager
Educational Data Communication, Assessment, Research and Evaluation
Farmer 418
Arizona State University
Tempe AZ 85287-0611
Email: [EMAIL PROTECTED]
URL: http://seamonkey.ed.asu.edu/~alex/






Re: Regression with repeated measures

2001-03-01 Thread Thom Baguley

Steve Gregorich wrote:
 
 Linear mixed models (aka
 multilevel models, random
 coefficient models, etc) as
 implemented by many software
 products: SAS PROC MIXED,
 MIXREG, MLwiN, HLM, etc.
 
 You might want to look at some
 links on my website
 
 http://sites.netscape.net/segregorich/index.html

There are a few good intros available (some, like Goldstein's and Hox's
books, are also on the web):

Goldstein, H. (1995). Multilevel statistical models. London: Arnold.
Hox, J. J. (1995). Applied multilevel analysis. Amsterdam: TT-Publikaties.
Paterson, L., & Goldstein, H. (1991). New statistical methods for analyzing
social structures: an introduction to multilevel models. British Educational
Research Journal, 17, 387-393.
Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel analysis: an
introduction to basic and advanced multilevel modeling. London: Sage.

The Snijders & Bosker is a very good intro. Kreft & de Leeuw also published
an intro text (though I haven't read it yet).

Thom





Re: Cronbach's alpha and sample size

2001-03-01 Thread Nicolas Sander


Thank you all for the helpful answers. 
I had the problem of obtaining negative alphas when some subjects were
excluded from the analyses (three out of ten). When they were included, I
had alphas of .65 to .75 (N items = 60). The problem, as I suspect, is
that the average inter-item correlation is very low and drops below zero
when these subjects are excluded. 
It might interest you that I'm used to negative correlations in the
correlation matrix, because I work with difference scores of reaction-time
measures (so there is no directional coding problem). Lots of repeated
measures ensure high consistency despite low average inter-item
correlations and despite some negative correlations between individual
measures.
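For what it's worth, the standardized-alpha formula shows why many items can mask a tiny average inter-item correlation, and why a slightly negative average flips alpha negative. A sketch (the correlation values are made up):

```python
def standardized_alpha(k, r_bar):
    """Standardized Cronbach's alpha from k items and the
    average inter-item correlation r_bar."""
    return k * r_bar / (1 + (k - 1) * r_bar)

# 60 items with a tiny positive average correlation still give
# a respectable alpha ...
print(standardized_alpha(60, 0.035))   # roughly .69
# ... but a slightly negative average gives a negative alpha.
print(standardized_alpha(60, -0.01))
```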

Nico
--





Re: Regression with repeated measures

2001-03-01 Thread Thom Baguley

Donald Burrill wrote:
 Probably the best approach is the multilevel (aka hierarchical) modelling
 advocated by previous respondents.  Possible problems with that approach:
 (1) you'll need purpose-built software, which may not be conveniently
 available at USD;  (2) the user is usually required (as I rather vaguely

Very good point.

 recall from a brush with Goldstein's ML3 a decade ago) to specify which
 (co)variances are to be estimated in the model, both within and between
 levels, and if your student isn't up to this degree of technical skill,
 (s)he may not have a clue as to what the output will be trying to say.

MLwiN is much easier to use (though it does require a good knowledge of
standard GLM regression equations). The default is just to model the
variance at each level, though adding in variance parameters is very easy.
I'd love to have a standard GLM program with the same interface (adding
and deleting terms from a visual representation of the regression equation).

I agree that in lots of cases multilevel modeling may be the "ideal" choice
but not sensible in practice (sample size considerations or for some teaching 
contexts).

For some problems, a multilevel model is not required at all. By treating
repeated observations as independent, N is inflated. It may be sufficient
(depending on what effects you want to estimate) just to correct N to
reflect this design effect. Snijders and Bosker's book is pretty lucid on
this (pp. 16-24).
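The correction itself is one line: divide N by the design effect 1 + (m - 1)*rho, where m is the number of observations per subject and rho the intraclass correlation. A sketch with made-up numbers:

```python
def effective_n(n_total, m, icc):
    """Effective sample size: n_total divided by the design
    effect 1 + (m - 1) * icc, for m observations per cluster."""
    return n_total / (1 + (m - 1) * icc)

# 10 S x 10 trials treated as 100 points: with an intraclass
# correlation of .5 they carry the information of only about
# 18 independent observations.
print(effective_n(100, 10, 0.5))
```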

Thom





comparing multiple correlated correlations

2001-03-01 Thread Allyson Rosen

OK, here's another question from a newbie.  In this small sample of 14
subjects, I wanted to compare several correlated correlations: individuals'
brain volumes correlated with a measure of memory performance.
Specifically, I wanted to say that one correlation is stronger than the
other three.  There's lots out there on comparing just 2 correlations, but
I wanted to compare all 4 at once.

The most appropriate article I found was by Olkin and Finn
 I. Olkin and J. Finn. Testing correlated correlations. Psychological
Bulletin 108(2):330-333, 1990.

The problem is that they assume huge sample sizes.  I consulted with a
statistician, and she suggested a jackknife procedure in which I set up the
following comparison:
r1 - average(r2, r3, r4)
I iteratively remove each subject, calculate this comparison, and take the
difference of that output from the total-group comparison.
I.e., r1 - average(r2, r3, r4) without subject 1 included, r1 -
average(r2, r3, r4) without subject 2 included ... and generate the
difference of each of these scores from the total scores.
Finally, I generate a confidence interval.  If that confidence interval does
not include zero, then the comparison is significant.
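In case it helps anyone, here is roughly the procedure in Python (a sketch only; the variable names are mine, and I used jackknife pseudo-values for the standard error):

```python
import numpy as np
from scipy.stats import t


def corr(x, y):
    return np.corrcoef(x, y)[0, 1]


def jackknife_ci(memory, vols, level=0.95):
    """Jackknife CI for r1 - average(r2, r3, r4), where r_i is the
    correlation of `memory` with the i-th array in `vols`."""
    n = len(memory)

    def stat(mask):
        rs = [corr(memory[mask], v[mask]) for v in vols]
        return rs[0] - np.mean(rs[1:])

    full = stat(np.ones(n, dtype=bool))
    loo = []
    for i in range(n):                 # leave each subject out in turn
        mask = np.ones(n, dtype=bool)
        mask[i] = False
        loo.append(stat(mask))

    # Jackknife pseudo-values and their standard error
    pseudo = n * full - (n - 1) * np.array(loo)
    se = pseudo.std(ddof=1) / np.sqrt(n)
    half = t.ppf(0.5 + level / 2, n - 1) * se
    return full - half, full + half
```

If the resulting interval excludes zero, the comparison is significant at the chosen level.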
It worked, and now I want to cite an appropriate source in the paper.  Is
there a good reference on similar jackknife procedures?  I found this in
the SPSS appendix.

 M. H. Quenouville. Approximate tests of correlation in time series. Journal
of the Royal Statistical Society, Series B 11:68, 1949

Many thanks,

Allyson







ANN: Book: Causation, Prediction, and Search

2001-03-01 Thread wolfskil

I thought readers of sci.stat.edu might be interested in this book.  For
more information please visit http://mitpress.mit.edu/promotions/books/SPICHF00.

Best,
Jud

Causation, Prediction, and Search
second edition
Peter Spirtes, Clark Glymour, and Richard Scheines

What assumptions and methods allow us to turn observations into causal
knowledge, and how can even incomplete causal knowledge be used in
planning and prediction to influence and control our environment? In
this book Peter Spirtes, Clark Glymour, and Richard Scheines address
these questions using the formalism of Bayes networks, with results that
have been applied in diverse areas of research in the social,
behavioral, and physical sciences.

The authors show that although experimental and observational study
designs may not always permit the same inferences, they are subject to
uniform principles. They axiomatize the connection between causal
structure and probabilistic independence, explore several varieties of
causal indistinguishability, formulate a theory of manipulation, and
develop asymptotically reliable procedures for searching over
equivalence classes of causal models, including models of categorical
data and structural equation models with and without latent variables.

The authors show that the relationship between causality and probability
can also help to clarify such diverse topics in statistics as the
comparative power of experimentation versus observation, Simpson's
paradox, errors in regression models, retrospective versus prospective
sampling, and variable selection.

The second edition contains a new introduction and an extensive survey
of advances and applications that have appeared since the first edition
was published in 1993. 

Peter Spirtes is Professor of Philosophy at the Center for Automated
Learning and Discovery, Carnegie Mellon University. Clark Glymour is
Alumni University Professor of Philosophy at Carnegie Mellon University
and Valtz Family Professor of Philosophy at the University of
California, San Diego. He is also Distinguished External Member of the
Center for Human and Machine Cognition at the University of West
Florida, and Adjunct Professor of Philosophy of History and Philosophy
of Science at the University of Pittsburgh. Richard Scheines is
Associate Professor of Philosophy at the Center for Automated Learning
and Discovery, and at the Human Computer Interaction Institute, Carnegie
Mellon University.

7 x 9, 496 pp., 225 illus., cloth ISBN 0-262-19440-6
Adaptive Computation and Machine Learning series
A Bradford Book

