# Re: [MORPHMET] A question regarding "target shape"

On 05/11/2018 18:50, Diego Ardón wrote:
Thank you Mr. Fruciano. I had already made the DFA, but wasn't aware the graphical output represented both groups (it certainly makes sense). I have a couple of other questions regarding semi-landmarks. I probably should start a new topic, but I'll first try out here:
So, I was advised to use semi-landmarks. I placed them with MakeFan8, saved the files as images and then used TpsDig to place all landmarks; however, I didn't make any distinction between landmarks and semi-landmarks. What unsettles me is (1) that I've recently come across the term "sliding semi-landmarks", which leads me to believe semi-landmarks should behave in a particular way.
Well, it's a long topic, but the general idea is that, to account for the uncertainty in the placement of a semilandmark along a curve, the semilandmark is slid along the curve itself (or, more frequently, along an approximation of it) so that, ideally, only variation perpendicular to the curve (reflecting the curvature) is retained. In current practice, semilandmarks are slid. Various software packages can do this; for 2D data, the most popular is certainly tpsRelW by F.J. Rohlf.
Gunz & Mitteroecker 2013. Semilandmarks: a method for quantifying curves and surfaces. Hystrix
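To make the idea of sliding a bit more concrete, here is a minimal sketch in Python/NumPy of a single sliding step under the minimum Procrustes distance criterion. It is only an illustration under several assumptions, not what tpsRelW or any other program actually does: the function name `slide_min_procrustes` is hypothetical, the curve is approximated at each semilandmark by the straight line through its two neighbours, the specimen is assumed to be already superimposed on the reference, and every semilandmark is assumed to have a neighbour on each side. Real software iterates sliding and superimposition and may minimize bending energy instead.

```python
import numpy as np

def slide_min_procrustes(shape, reference, semi_idx):
    """One sliding step (hypothetical helper, illustration only).

    shape, reference: (k, 2) arrays of already superimposed 2D coordinates.
    semi_idx: indices of the semilandmarks; each must have neighbours
              at i - 1 and i + 1 on the same curve (an assumption here).
    """
    shape = np.asarray(shape, dtype=float)
    reference = np.asarray(reference, dtype=float)
    slid = shape.copy()
    for i in semi_idx:
        # Approximate the curve direction by the chord between the neighbours.
        tangent = shape[i + 1] - shape[i - 1]
        tangent /= np.linalg.norm(tangent)
        # Displacement along the tangent that minimizes the squared
        # distance to the corresponding point of the reference.
        s = np.dot(reference[i] - shape[i], tangent)
        slid[i] = shape[i] + s * tangent
    return slid

# Toy example: three points, the middle one is a semilandmark.
shape = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 0.0]])
reference = np.array([[0.0, 0.0], [1.3, 0.5], [2.0, 0.0]])
slid = slide_min_procrustes(shape, reference, semi_idx=[1])

# After sliding, the remaining difference from the reference is
# perpendicular to the tangent direction.
residual = slid[1] - reference[1]
```

In this toy example the tangential part of the difference between the semilandmark and the reference is removed, and only the part perpendicular to the (approximated) curve is left, which is the "ideally only variation perpendicular to the curve" mentioned above.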
The second thing that unsettles me is whether "more semi-landmarks" means a better analysis.
Not necessarily.

I can understand that most people wouldn't use 65 landmarks+semilandmarks because it's a painstaking job to digitize them. However, in my recent reading I've come across concepts like a "variables to specimen ratio"; one paper suggested the number of specimens should be 5 times the number of variables. I do have a data set of nearly 400 specimens, but it does come up short if indeed I should have 65*2*5 specimens!
There are two issues: 1. whether statistical procedures are defined, 2. whether one has enough power and/or how large the error in estimates is.
The first issue is easy to deal with: certain statistical procedures (for instance, the ones involving matrix inversion) are not defined if there are many variables and relatively few cases. These procedures simply "don't work". However, there are alternative procedures which do work (e.g., the ones based on distances) and/or workarounds (e.g., use of generalized inverses).
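As a small illustration of this first issue, the sketch below (Python/NumPy, with simulated random numbers rather than real landmark data) uses 130 variables, as with 65 points in 2D, but only 20 specimens. The sample sizes, the variable names and the tolerance passed to `pinv` are all assumptions chosen for the example. The covariance matrix cannot have full rank, so an ordinary inverse is not defined, while a Moore-Penrose generalized inverse can still be computed.

```python
import numpy as np

rng = np.random.default_rng(0)

n_specimens, n_vars = 20, 130      # e.g. 65 points x 2 coordinates
X = rng.normal(size=(n_specimens, n_vars))   # simulated data, not real shapes

# Sample covariance matrix of the variables (130 x 130).
S = np.cov(X, rowvar=False)

# Its rank is at most n_specimens - 1, far below 130, so S is singular
# and procedures that need the ordinary inverse of S are not defined.
rank = np.linalg.matrix_rank(S)
print("rank of covariance matrix:", rank, "out of", n_vars)

# Workaround: Moore-Penrose generalized inverse. The explicit tolerance
# (an assumption for this example) discards numerically zero eigenvalues.
S_pinv = np.linalg.pinv(S, rcond=1e-10, hermitian=True)

# Mahalanobis-type squared distance of the first specimen from the mean,
# computed with the generalized inverse instead of the ordinary inverse.
d = X[0] - X.mean(axis=0)
d2 = d @ S_pinv @ d
print("squared distance using the generalized inverse:", d2)
```

That the generalized inverse can be computed says nothing about the second issue: the estimates it produces can still be poor when specimens are few relative to variables.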
The second issue is much more complex and I doubt one can give a straightforward answer. In general, the more observations (specimens) the better (when one can get them, that is). But the idea of a certain number of observations relative to the number of variables is, at best, a rule of thumb.
Clearly, having too many variables can create problems and artifacts. An interesting recent example of this can be found in
Bookstein 2016 - A newly noticed formula enforces fundamental limits of geometric morphometric analyses. Evolutionary Biology
In your particular case, if I were you I would ask myself whether all those points/semilandmarks are that necessary to capture biologically relevant variation. That is a question that only you can answer, based on your knowledge of the biological problem at hand.
Statistical power and reliability of estimates are another issue, which is in part dataset-dependent (as well as dependent on which statistical procedures you intend to use). An interesting paper dealing with this is
Cardini 2007. Sample size and sampling error in geometric morphometric studies of size and shape. Zoomorphology
In general, as said above, it's very hard to give straightforward answers to your question.
I hope this still helps, though.
Dear Diego,
MorphoJ will actually do it. The easiest is to use what is under the
menu "Discriminant analysis". MorphoJ's user guide has a brief but very
clear description of the graphical output.
I hope this helps.
Best,
Carmelo

