Re: [MORPHMET] Morphometrics and quadratic discriminant function question

K. James Soda Mon, 06 Jul 2015 02:02:02 -0700

Dear Eimear,

I don't think I am going to be able to answer all your questions, but I
think I can get the ball rolling and help narrow down what you need to look
for; hopefully someone will be able to elaborate on my explanation and get
you all the way there.

To start, I need to point out two core ideas from multivariate statistics
(and I apologize if you already know this).  First, many techniques in
multivariate statistics stem from the idea that you can take variables
present in your data and combine them into "new" variables that have some
desirable properties.  Usually this involves taking a weighted sum of the
variable values, so in more mathematical parlance we say that the new
variables are linear combinations.  A key point is that these linear
combinations are still random variables and, as a result, we can visualize
the linear combinations in the same way we visualize any multivariate data
set.  Second, although multidimensional, a random, multivariate data point
originates, by definition, from a probability distribution, and we can make
inferences about the distribution.  For example, we can draw confidence
regions, which will contain a population parameter, often the mean vector,
with a known probability.
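
To make the first idea concrete, here is a minimal sketch in Python with
NumPy (not the R/MASS setup used in this thread).  The data and the weights
are made up purely for illustration.  It builds one linear combination as a
weighted sum and checks that its sample variance can be obtained from the
covariance matrix S of the original variables as w'Sw.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 50 specimens measured on 3 variables.
X = rng.normal(size=(50, 3))
# Hypothetical weights defining one linear combination.
w = np.array([0.5, -1.0, 2.0])

# The "new" variable is a weighted sum of the original variables.
z = X @ w

# Because z is itself a random variable, it has a variance, and that
# variance follows from the covariance matrix: var(z) = w' S w.
S = np.cov(X, rowvar=False)
var_from_S = w @ S @ w
var_direct = z.var(ddof=1)
print(var_from_S, var_direct)
```

The two printed values should agree up to floating-point error, which is
the sense in which the linear combination behaves like any other variable
in the data set.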

In the graphic you mention, the statistical software JMP first transformed
the sample space into a space of discriminant functions.  Then it
estimated where the population mean for each species is located in the
discriminant space, along with regions composed of other probable
locations.  If these regions were drawn for an infinite number of repeated
samples, 95% of them would contain the true population mean.  What I do not
know, however, is HOW it transformed the original sample space into the
discriminant space.
In LDA, this process is relatively straightforward.  It is geometrically
equivalent to transforming the variables so that they are uncorrelated and
have unit variance within groups and then performing a PCA on the group
means.  This process is possible because LDA assumes that every group has
the same covariance structure.  This is
not the case in QDA, and the descriptions that I have read generally do not
describe QDA in terms of finding linear combinations or applying some sort
of transformation.
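
The LDA construction described above can be sketched as follows, in Python
with NumPy rather than R.  This is only an illustration under my own
assumptions: the function name `lda_scores` and the simulated data are
hypothetical, the group means are weighted by the square root of the group
size (one common convention), and I am not claiming this is what JMP or
MASS does internally.

```python
import numpy as np

def lda_scores(X, y):
    """Discriminant scores via whitening followed by PCA on group means."""
    classes = np.unique(y)
    n, p = X.shape
    grand_mean = X.mean(axis=0)

    # Pooled within-group covariance (assumed common to all groups in LDA).
    W = np.zeros((p, p))
    M = []
    for c in classes:
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        W += (Xc - mu).T @ (Xc - mu)
        # Group mean centered on the grand mean, weighted by sqrt(group size).
        M.append(np.sqrt(len(Xc)) * (mu - grand_mean))
    W /= n - len(classes)

    # Step 1: whitening transform W^(-1/2), so that within-group variation
    # becomes uncorrelated with unit variance.
    evals, evecs = np.linalg.eigh(W)
    whiten = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Step 2: PCA (via SVD) on the whitened, already centered group means.
    _, _, Vt = np.linalg.svd(np.array(M) @ whiten, full_matrices=False)
    r = min(p, len(classes) - 1)  # at most (number of groups - 1) axes
    scalings = whiten @ Vt[:r].T
    return (X - grand_mean) @ scalings

# Simulated example: 3 groups of 30 specimens, 4 variables.
rng = np.random.default_rng(1)
shift = np.array([[0, 0, 0, 0], [2, 0, 1, 0], [0, 2, 0, 1]])
y = np.repeat([0, 1, 2], 30)
X = rng.normal(size=(90, 4)) + shift[y]
scores = lda_scores(X, y)
print(scores.shape)  # (90, 2)
```

With three groups there are two discriminant axes, and the pooled
within-group covariance of the scores is the identity matrix.  The sketch
leans on a single pooled matrix W throughout, which is exactly the step
that has no direct counterpart in QDA.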

In my (admittedly young) opinion, how you present the results of your QDA
depends on what you want to say about the results.  If you want to
establish a rule for differentiating otoliths from different populations,
then presenting the results of a leave-one-out cross-validation (or, better
still, classification accuracies from an independent test set) may be
sufficient.  In contrast, if you want to say that the populations are
different from each other, it gets a little bit trickier because you have
to address what you mean when you say "different".  Often discriminant
analysis can be pulled into this question because one could argue that
populations are different if samples in those populations can be classified
to the correct group, and thus can be distinguished from each other. This
is a legitimate argument, certainly, but it is important to remember, in my
opinion, that this was not the goal of the individuals who created these
methods, so you need to be cautious about how you interpret the results.
Take a look at this paper for more thoughts on the process:

Kovarovic, K., Aiello, L. C., Cardini, A., & Lockwood, C. A. (2011).
Discriminant function analyses in archaeology: are classification rates too
good to be true? *Journal of Archaeological Science*, *38*(11), 3006-3018.

Getting back to your issue, though, in this latter case, it may make more
sense to provide an ordination as well, so long as you understand what the
ordination is actually showing and whether what it is showing is relevant
to your argument; I am afraid I cannot lend advice on that point without
knowing how the sample space was transformed.
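
For the classification-rule case, the leave-one-out cross-validation
mentioned earlier could be sketched like this, in Python with NumPy.  The
function names and the simulated two-group data are hypothetical, and the
QDA rule is written out by hand with priors estimated from group
proportions.  In R, I believe `qda` in MASS can return leave-one-out
predictions through its `CV = TRUE` argument, which would be the more
natural route for your data.

```python
import numpy as np

def qda_fit(X, y):
    """Estimate a mean, covariance matrix and prior for each group."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return params

def qda_predict(params, x):
    """Assign x to the group with the largest quadratic discriminant score."""
    best, best_score = None, -np.inf
    for c, (mu, S, prior) in params.items():
        d = x - mu
        _, logdet = np.linalg.slogdet(S)
        score = -0.5 * logdet - 0.5 * d @ np.linalg.solve(S, d) + np.log(prior)
        if score > best_score:
            best, best_score = c, score
    return best

def loo_accuracy(X, y):
    """Refit without each specimen in turn, then classify that specimen."""
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        pred = qda_predict(qda_fit(X[keep], y[keep]), X[i])
        hits += int(pred == y[i])
    return hits / len(X)

# Simulated example: two groups with different covariance matrices.
rng = np.random.default_rng(2)
X = np.vstack([
    rng.multivariate_normal([0, 0], [[1, 0], [0, 1]], size=30),
    rng.multivariate_normal([6, 6], [[2, 0.8], [0.8, 1]], size=30),
])
y = np.repeat([0, 1], 30)
acc = loo_accuracy(X, y)
print(f"Leave-one-out accuracy: {acc:.2f}")
```

Each specimen is classified by a rule that never saw it, which is why the
resulting accuracy is a less optimistic estimate than the resubstitution
rate.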

Finally, in response to your question 3, if I understand what you are
looking at properly, these two plots are describing two different things.
The plot of the discriminant axes is showing you a transformed sample space
and where your data lie in this space.  In contrast, the confidence
ellipses estimate where the mean vector for each population is in the
transformed sample space, along with regions where the mean vector is also
likely to be located (see above).
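
The difference between the two plots can be illustrated with a small sketch
in Python with NumPy.  It is a rough illustration under stated assumptions:
the scores are simulated, the helper `ellipse_semi_axes` is hypothetical,
and the ellipses use a large-sample chi-square approximation in two
dimensions.  I do not know the exact computation JMP uses, and for small
samples a Hotelling's T-squared region would be more appropriate.

```python
import numpy as np

def ellipse_semi_axes(cov, level=0.95):
    """Semi-axis lengths of an ellipse based on a 2 x 2 covariance matrix."""
    # With 2 dimensions the chi-square quantile has a closed form.
    q = -2.0 * np.log(1.0 - level)
    return np.sqrt(q * np.linalg.eigvalsh(cov))

# Simulated discriminant scores for one group of n specimens.
rng = np.random.default_rng(3)
n = 40
scores = rng.multivariate_normal([0, 0], [[3, 1], [1, 2]], size=n)
S = np.cov(scores, rowvar=False)

# Ellipse describing where individual specimens tend to fall.
data_axes = ellipse_semi_axes(S)
# Ellipse describing where the population mean plausibly lies: the
# covariance of the sample mean is S / n, so this ellipse is much smaller.
mean_axes = ellipse_semi_axes(S / n)
print(data_axes, mean_axes)
```

The ellipse for the mean is smaller than the ellipse for the data by a
factor of the square root of n, so it shrinks as the sample grows while the
scatter of the specimens does not.  This is one reason the two kinds of
plot can look so different for the same analysis.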

Hope this helps (sorry it was so long),

James Soda

On Sun, Jun 28, 2015 at 1:51 AM, Eimear Egan <eimearmariee...@gmail.com>
wrote:

> Hi all.
>
> I am examining otolith morphology using elliptical Fourier analysis and
> shape indices. I am using the MASS package in R to run a quadratic
> discriminant function analysis and I have gotten myself in a bit of a tizzy!
>
> I am confused about how to present the results from the QDF. Most people
> state the Jack-knife classifications in a table but I have seen one paper
> that has plotted the centroid and 95% confidence ellipse (see link)
>
> http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0034481
>
> 1. I don't understand what this plot represents. Is it the posterior
> probabilities and if so is it correct to represent the data like this?
>
> 2. I have trawled through google and came across a link with code for the
> iris dataset. The author also plots the centroid and 95% confidence ellipse
> and this is of the posterior probabilities (but uses linear DA).
>
> 3. I have run a linear DA on my data just to get to grips with
> discriminant function in R. Ordination of the two discriminant axes (I have
> three groups) versus plotting the posterior probabilities and confidence
> ellipses produce very different results...... as expected I guess. Are both
> methods correct but just different ways of visualizing the data? And how
> does this apply in the context of QDF?
>
> Any insights would be greatly appreciated.
>
> Thanks,
>
> Eimear
>
>
> *Eimear Marie Ceileadh Egan*
> *PhD Candidate*
> Marine Ecology Research Group.
> University of Canterbury,
> School of Biological Sciences,
> Private Bag 4800, Christchurch 8140, New Zealand
> *Telephone* +64 3 364 2987
>
>
>
>
>  --
> MORPHMET may be accessed via its webpage at http://www.morphometrics.org
>
> To unsubscribe from this group and stop receiving emails from it, send an
> email to morphmet+unsubscr...@morphometrics.org.
>

