[MORPHMET] Re: Are more semi landmarks better??

2018-11-06 Thread Diego Ardón
Thanks to everyone who has replied. I sure have a lot of reading to do, 
but overall I feel more comfortable about my data. I might end up playing 
around with removing some semi-landmarks, figuring that it shouldn't affect 
the outcome much. I'll get back if I have any other questions.

Thanks again

On Monday, 5 November 2018, 11:52:57 (UTC-6), Diego Ardón wrote:
>
> Good day everybody, I actually have two questions here regarding 
> semi-landmarks:
>
> So, I was advised to use semi-landmarks. I placed them with MakeFan8, 
> saved the files as images and then used TpsDig to place all landmarks; 
> however, I didn't make any distinction between landmarks and 
> semi-landmarks. What unsettles me is (1) that I've recently come across 
> the term "sliding semi-landmarks", which leads me to believe semi-landmarks 
> should behave in a particular way. 
>
> The second thing that unsettles me is whether "more semi-landmarks" means 
> a better analysis. I can understand that most people wouldn't use 65 
> landmarks+semilandmarks because it's a painstaking job to digitize them; 
> however, in my recent reading I've come across concepts like a "variables to 
> specimens ratio", for which one paper suggested the number of specimens should 
> be 5 times the number of variables. I do have a data set of nearly 400 
> specimens, but it does come up short if indeed I should have 65*2*5 specimens!
>
> Please, I'll appreciate some feedback :)
>
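
As a rough check of the arithmetic in the question, here is a small sketch. The 5:1 rule of thumb and the figures 65 and 400 come from the message above; the function name and the optional subtraction of 4 (the dimensions lost to translation, scaling and rotation in a 2D Procrustes fit) are additions for illustration only, not a recommendation.

```python
def required_specimens(n_points, dims=2, ratio=5, procrustes=False):
    """Specimens needed under a 'ratio x variables' rule of thumb.

    n_points: landmarks plus semilandmarks per specimen.
    procrustes: if True, count shape dimensions after a 2D Procrustes fit
    (2k - 4) instead of raw coordinates (assumption: 2D data only).
    """
    variables = n_points * dims
    if procrustes:
        if dims != 2:
            raise ValueError("the 2k - 4 count used here is for 2D data")
        variables -= 4
    return ratio * variables


raw = required_specimens(65)                            # 65 * 2 * 5
shape_space = required_specimens(65, procrustes=True)   # (65 * 2 - 4) * 5
print(raw, shape_space, 400 >= raw)
```

Either way of counting leaves a sample of about 400 short of the 5:1 rule, which is the concern raised in the question.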

-- 
MORPHMET may be accessed via its webpage at http://www.morphometrics.org


Re: [MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread N. MacLeod
Agreed. In addition, I think it’s important to note that, in the original 
implementations of the sliding algorithm, semilandmarks were slid not along the 
curve itself, but along tangents to the curve (= off the boundary outline). How 
much distortion this induces is, of course, a function of how much the 
semilandmarks are displaced from their original positions. However, it’s always 
seemed problematic to me that, after sliding, you end up with shapes that have 
been distorted to a greater or lesser extent. Of course, if the displacement is 
small the amount of distortion will (likely) be small and the results might 
not be all that different. Moreover, as Philipp notes, in terms of many types 
of analyses, linear data transformations make no difference to the outcome of 
an analysis. But given these facts, the point of sliding the semilandmarks at 
all seems questionable in many contexts. Moreover, in the case of complex 
boundary outline curves - in other words, the curves semilandmarks are usually 
called upon to quantify - since the magnitude of the slide is, to a large 
extent, determined by the density of the semilandmark placements, large 
semilandmark displacements will never occur. So, if you have a curve that is so 
smooth it only needs a few semilandmarks to tie down, you run the risk of 
generating some (presently unspecified) degree of distortion in your data by 
sliding the semilandmarks so long as the sliding takes place along tangents. 
But if your curve is complex it’s unlikely that sliding the semilandmarks will 
make much difference because the distance along which sliding can take place is 
constrained. Sliding semilandmarks is an interesting strategy in principle. But 
in many cases the (current) practice is fraught with problems that are rarely 
acknowledged.
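
To make the "off the boundary outline" point concrete, here is a minimal sketch that slides a point along the tangent of a curve and measures how far it ends up from the curve. The unit circle is only a convenient stand-in for a smooth outline, and the function name and step sizes are made up for illustration.

```python
import math


def slide_along_tangent(theta, step, radius=1.0):
    """Slide a point on a circle along its tangent line by `step`.

    Returns the new point and how far it now lies from the circle.
    """
    x, y = radius * math.cos(theta), radius * math.sin(theta)
    tx, ty = -math.sin(theta), math.cos(theta)   # unit tangent at the point
    new_x, new_y = x + step * tx, y + step * ty
    off_curve = math.hypot(new_x, new_y) - radius
    return (new_x, new_y), off_curve


for step in (0.05, 0.2, 0.5):
    _, gap = slide_along_tangent(math.pi / 3, step)
    print(f"slide {step:.2f} -> {gap:.4f} off the curve")
```

On this toy curve the gap grows roughly with the square of the slide for small steps, which is consistent with the remark that small displacements induce little distortion.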

Norm MacLeod




Re: [MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread mitte...@univie.ac.at
Yes, it was always well known that sliding adds covariance but this is 
irrelevant for most studies, especially for group mean comparisons and 
shape regressions: the kind of studies for which GMM is most efficient, as 
Jim noted. 
If you consider the change of variance-covariance structure due to (a small 
amount of) sliding as an approximately linear transformation, then the 
sliding is also largely irrelevant for CVA, relative PCA, Mahalanobis 
distance and the resulting group classifications, as they are all based on 
the relative eigenvalues of two covariance matrices and thus unaffected by 
linear transformations. In other words, in the absence of a reasonable 
biological null model, the interpretation of a single covariance structure 
is very difficult, but the way in which one covariance structure deviates 
from another can be interpreted much more easily. 
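
The invariance claim can be checked numerically. The sketch below, with made-up two-dimensional numbers and hypothetical helper names, applies an arbitrary invertible linear map to both a mean difference and a covariance matrix and compares the squared Mahalanobis distance before and after. It illustrates only the algebra; whether sliding is close enough to a linear transformation in a given data set is a separate question.

```python
def mahalanobis_sq(diff, cov):
    """Squared Mahalanobis distance for a 2-vector and a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    v0 = inv[0][0] * diff[0] + inv[0][1] * diff[1]
    v1 = inv[1][0] * diff[0] + inv[1][1] * diff[1]
    return diff[0] * v0 + diff[1] * v1


def matvec(m, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def transform_cov(m, cov):
    """Return M * cov * M^T for 2x2 matrices."""
    mc = [[sum(m[i][k] * cov[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(mc[i][k] * m[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


diff = (1.0, -0.5)                # difference between two group means (made up)
cov = [[2.0, 0.6], [0.6, 1.0]]    # pooled within-group covariance (made up)
M = [[1.5, 0.4], [-0.3, 0.8]]     # any invertible linear map

before = mahalanobis_sq(diff, cov)
after = mahalanobis_sq(matvec(M, diff), transform_cov(M, cov))
print(before, after)
```

The two values agree up to rounding because the map cancels between the transformed difference and the inverse of the transformed covariance.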

Concerning your example: The point is that there is no useful model of 
"totally random data" (but see Bookstein 2015 Evol Biol). Complete 
statistical independence of shape coordinates is geometrically impossible 
and biologically absurd. Under which biological (null) model can two parts 
of a body, especially two traits on a single skeletal element such as the 
cranium, be completely uncorrelated?

Clearly, semilandmarks are not always necessary, but making "cool pictures" 
can be quite important in its own right for making good biology, especially 
in exploratory settings. Isn't the visualization one of the primary 
strengths of geometric morphometrics?

It is perhaps also worth noting that one can avoid a good deal of the 
additional covariance resulting from sliding. Sliding via minimizing 
bending energy introduces covariance in the position of the semilandmarks 
_along_ the curve/surface. In some of his analyses, Fred Bookstein just 
included the coordinate perpendicular to the curve/surface for the 
semilandmarks, thus discarding a large part of the covariance. Note also 
that sliding via minimizing Procrustes distance introduces little 
covariance among semilandmarks because Procrustes distance is minimized 
independently for each semilandmark (but the homology function implied here 
is not so appealing biologically). 
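
A minimal sketch of the two ideas in this paragraph, under strong simplifying assumptions: a single 2D semilandmark, a fixed tangent direction, and a fixed reference point standing in for the corresponding point of the mean shape. In practice sliding is iterated together with the superimposition. The function names are hypothetical.

```python
import math


def slide_to_reference(point, tangent, reference):
    """Slide `point` along its tangent line to get as close as possible
    to `reference`; each semilandmark is handled on its own."""
    norm = math.hypot(*tangent)
    ux, uy = tangent[0] / norm, tangent[1] / norm
    # Least-squares step: projection of (reference - point) onto the tangent.
    t = (reference[0] - point[0]) * ux + (reference[1] - point[1]) * uy
    return (point[0] + t * ux, point[1] + t * uy)


def normal_component(point, tangent, reference):
    """Signed offset of `point` from `reference`, perpendicular to the tangent."""
    norm = math.hypot(*tangent)
    nx, ny = -tangent[1] / norm, tangent[0] / norm
    return (point[0] - reference[0]) * nx + (point[1] - reference[1]) * ny


p, u, r = (1.0, 2.0), (1.0, 1.0), (3.0, 2.5)   # made-up point, tangent, reference
slid = slide_to_reference(p, u, r)
print(slid, normal_component(p, u, r), normal_component(slid, u, r))
```

The perpendicular coordinate is the same before and after the slide, which is why keeping only that coordinate discards the along-curve covariance that sliding introduces.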

Best,

Philipp




Re: [MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread alcardini
Indeed one of my favourite examples where semilandmarks are really
useful is a paper by Hublin, Gunz et al. (with apologies for the
inaccurate ref. and mixed up order of authors) where they manage to
classify as Neanderthal a piece of cranial vault found (I believe) in
Belgium and possibly in the sea. With all the limits we mentioned,
it's probably hard to do any better.
Certainly it's not the only type of useful application, but that kind
of 'forensic' analysis, where the main aim is pure classification
accuracy, is where I see (potentially) fewer problems. However, even
with phenetics (in the original sense of the term), putting together
all sorts of different characters and character states, regardless of
their evolutionary significance, one could probably get a very good
and stable classification. Yet, to what extent that would be
biologically meaningful is hard to say, and we abandoned phenetics for
cladistics.
With shape data, as implied by Jim's comment on my message, we don't
yet have the same kind of understanding to be able to do the same
(some kind of 'biologically meaningful' superimposition).
For now, in a way, it seems to me that it's as if we were aligning
sequences using, say, a least-squares method, which molecular
biologists would never use because they know much more about DNA
evolution and can model that more accurately in the alignment. We
can't. Thus, for me, semilandmarks are useful if (as in your example)
they provide relevant information that there is no other
way to get. The limits will still be there but the pros may be more
than the cons. Where I disagree is the general trend to believe that
more is always better.

The field is different but I am very much sympathetic with what
Hawkins said in his review of spatial data analysis:
"This has led to a confusing
literature and a proliferation of increasingly complicated
analytical methods that are difficult to evaluate or even
understand if you are not a statistician. A tendency to ignore
assumptions of many of these complex methods does not help
matters. It has also diverted attention away from epistemological
and conceptual issues of importance to our field, some
of which I have tried to highlight. Although it is self-evident
that statistics are an indispensible tool for evaluating data,
when we focus too much on methods it is natural to add new
layers of complexity as our view becomes narrower and
narrower and we try to capture every nuance of our data. But
biogeography and geographical ecology are not branches of
theoretical statistics, and there does come a point at which
analytical complexity begins to interfere with understanding."
Journal of Biogeography (J. Biogeogr.) (2012) 39, 1–9

Good night

Andrea




Re: [MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread Mike Collyer
Andrea,

I am intrigued by your initial comment about adding covariance that was 
apparently absent.  I tend to think of the problem from the other perspective 
of not accounting for covariance that should be present.  As a thought 
experiment (that could probably be simulated, and maybe I am not correct in my 
thinking), I like to think of two landmark configurations that are the same in 
all regards except for one curve, where two groups have distinctly different 
curves but maybe would not be obviously distinctively different if an 
insufficient number of semi-landmarks (or none) were used to characterize the 
curve.  If one were to (maybe simulate this example and) use one sparse 
representation of landmarks and one dense representation, perform a 
cross-validation classification analysis, and calculate posterior 
classification probabilities (let’s assume equal sample sizes and, therefore, 
equal prior probabilities), I would expect that the posterior probabilities of 
the dense landmark configuration would better assign specimens to the 
appropriate process that generated them (i.e., their correct groups).  The 
posterior probabilities would be closer to 0 and 1 because of the “added 
covariance”, as reflected by the squared generalized Mahalanobis distances, 
based on the pooled within-group covariance.  The added covariance would be 
essential for the posterior probabilities, if the sparse configurations 
produced similar generalized distances to group means, and therefore, similar 
posterior probabilities for classification.
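
The link between generalized distances and posterior probabilities in this thought experiment can be sketched as follows, assuming multivariate normal groups with a common covariance matrix. The distances are made up and the function name is hypothetical; this is not a simulation of the sparse versus dense comparison itself.

```python
import math


def posteriors(sq_distances, priors=None):
    """Posterior group probabilities from squared Mahalanobis distances,
    assuming multivariate normal groups with a common covariance matrix."""
    g = len(sq_distances)
    priors = priors or [1.0 / g] * g
    # Subtract the smallest distance first for numerical stability.
    d_min = min(sq_distances)
    weights = [p * math.exp(-0.5 * (d - d_min))
               for p, d in zip(priors, sq_distances)]
    total = sum(weights)
    return [w / total for w in weights]


sparse = posteriors([2.0, 2.5])    # similar distances to both group means
dense = posteriors([2.0, 12.0])    # same specimen, groups better separated
print(sparse, dense)
```

When the two distances are similar the posteriors stay near 0.5, and they move toward 0 and 1 only once the distances differ substantially.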

I’m not sure adding covariance is an issue.  To me it simply changes the 
hypothetical (null) covariance structure, which Philipp mentioned should 
probably not be assumed to be independent (isotropic).  I think your example 
might best highlight that a different multivariate normal distribution of 
residuals is to be expected with a different configuration.

Cheers!
Mike



RE: [MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread F. James Rohlf
Perhaps, but Procrustes superimposition already adds lots of covariances also. 
It is a bit tricky (meaning that I do not know of a good solution) to preserve 
the "real" covariances and distinguish them from artifacts of fitting. GM works 
well for testing differences among means of groups but studying covariances 
among shape variables is a much more difficult problem. Some ML approaches have 
been suggested that could minimize the covariances due to superimposition but 
the ones I have looked at require some very unreasonable biological assumptions 
about their statistical properties.
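
A small illustration of how superimposition creates covariance, restricted to the translation step alone (an assumption made to keep the algebra exact; scaling and rotation add further dependencies that are not shown). If the raw x-coordinates of k landmarks are independent with equal variance, centering each configuration makes every pair of landmarks negatively correlated. The function name is hypothetical.

```python
def centered_covariance(k, sigma2=1.0):
    """Covariance among the x-coordinates of k landmarks after centering,
    when the raw coordinates are independent with variance sigma2.

    Centering is x -> C x with C = I - (1/k) * ones; since C is symmetric
    and idempotent, the covariance becomes sigma2 * C.
    """
    return [[sigma2 * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]


cov = centered_covariance(5)
print(cov[0][0], cov[0][1])   # variance shrinks, off-diagonals become negative
```

Each row of the resulting matrix sums to zero, which is the lost degree of freedom showing up as covariance that was not in the raw coordinates.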

Such discussions will not die until there are good solutions or else someone 
proves that no good solution is possible.  I still have hope for some clever 
idea.

_ _ _ _ _ _ _ _ _
F. James Rohlf, Distinguished Prof. Emeritus

Depts. of Anthropology and of Ecology & Evolution



Re: [MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread alcardini
Yes, but doesn't that also add more covariance that wasn't there in
the first place?
Neither least squares nor minimum bending energy, that we minimize for
sliding, are biological models: they will reduce variance but will do
it in ways that are totally biologically arbitrary.

In the examples I showed sliding led to the appearance of patterns
from totally random data and that effect was much stronger than
without sliding.
I advocate neither sliding nor not sliding. Semilandmarks are different
from landmarks and more is not necessarily better. There are
definitely some applications where I find them very useful but many
more where they seem to be there just to make cool pictures.

As Mike said, we've already had this discussion. Besides different
views on what to measure and why, at that time I hadn't appreciated
the problem with p/n and the potential strength of the patterns
introduced by the covariance created by the superimposition (plus
sliding!).
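
One concrete aspect of the p/n problem can be shown with a toy data set: with n specimens, the mean-centered data (and hence the sample covariance matrix) has rank at most n - 1, so it is singular whenever the number of variables p exceeds that. The numbers below are made up and the helpers are hypothetical; this is not a reconstruction of the examples mentioned in the message.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def center_columns(rows):
    """Subtract each column mean, as done before computing covariances."""
    n = len(rows)
    means = [Fraction(sum(col), n) for col in zip(*rows)]
    return [[Fraction(v) - mu for v, mu in zip(row, means)] for row in rows]


# 4 specimens, 6 variables (made-up integers): p exceeds n.
data = [[3, 1, 4, 1, 5, 9],
        [2, 6, 5, 3, 5, 8],
        [9, 7, 9, 3, 2, 3],
        [8, 4, 6, 2, 6, 4]]
n, p = len(data), len(data[0])
print(rank(center_columns(data)), "<=", n - 1, "<", p)
```

Methods that need the inverse of the covariance matrix, such as Mahalanobis distance, therefore cannot be applied to all p variables directly in that situation.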

Cheers

Andrea

On 06/11/2018, F. James Rohlf  wrote:
> I agree with Philipp but I would like to add that the way I think about the
> justification for the sliding of semilandmarks is that if one were smart
> enough to know exactly where the most meaningful locations are along some
> curve then one should just place the points along the curve and
> computationally treat them as fixed landmarks. However, if their exact
> positions are to some extent arbitrary (usually the case) although still
> along a defined curve then sliding makes sense to me as it minimizes the
> apparent differences among specimens (the sliding minimizes your measure of
> how much specimens differ from each other or, usually, the mean shape).
>
>
>
> _ _ _ _ _ _ _ _ _
>
> F. James Rohlf, Distinguished Prof. Emeritus
>
>
>
> Depts. of Anthropology and of Ecology & Evolution
>
>
>
>
>
> From: mitte...@univie.ac.at 
> Sent: Tuesday, November 6, 2018 9:09 AM
> To: MORPHMET 
> Subject: [MORPHMET] Re: semilandmarks in biology
>
>
>
> I agree only in part.
>
>
>
> Whether or not semilandmarks "really are needed" may be hard to say
> beforehand. If the signal is known well enough before the study, even a
> single linear distance or distance ratio may suffice. In fact, most
> geometric morphometric studies are characterized by an oversampling of
> (anatomical) landmarks as an exploratory strategy: it allows for unexpected
> findings (and nice visualizations).
>
>
>
> Furthermore, there is a fundamental difference between sliding semilandmarks
> and other outline methods, including EFA. When establishing correspondence
> of semilandmarks across individuals, the minBE sliding algorithm takes the
> anatomical landmarks (and their stronger biological homology) into account,
> while standard EFA and related techniques cannot easily combine point
> homology with curve or surface homology. Clearly, when point homology
> exists, it should be parameterized accordingly. If smooth curves or surfaces
> exist, they should also be parameterized, whether or not this makes the
> analysis slightly more challenging.
>
>
>
> Anyway, different landmarks often convey different biological signals and
> different homology criteria. For instance, Type I and Type II landmarks
> (sensu Bookstein 1991) differ fundamentally in their notion of homology.
> Whereas Type I landmarks are defined in terms of local anatomy or histology,
> a Type II landmark is a purely geometric construct, which may or may not
> coincide with notions of anatomical/developmental homology. ANY reasonable
> morphometric analysis must be interpreted in the light of the correspondence
> function employed, and the same holds true for semilandmarks. For this, of
> course, one needs to understand the basic properties of sliding landmarks,
> much as the basic properties of Procrustes alignment, etc. For instance,
> both the sliding algorithm and Procrustes alignment introduce correlations
> between shape coordinates (hence their reduced degrees of freedom). This is
> one of the reasons why I have warned for many years and in many publications
> against the biological interpretation of raw correlations (e.g., summarized in
> Mitteroecker et al. 2012 Evol Biol). Interpretations in terms of
> morphological integration or modularity are even more difficult because in
> most studies these concepts are not operationalized. They are either
> described by vague and biologically trivial narratives, or they are
> themselves defined as patterns of correlations, which is circular and makes
> most "hypotheses" untestable.
>
>
>
> The same criticism applies to the naive interpretation of PCA scree plots
> and derived statistics. An isotropic (circular) distribution of shape
> coordinates corresponds to no biological model or hypothesis whatsoever
> (e.g., Huttegger & Mitteroecker 2011, Bookstein & Mitteroecker 2014, and
> Bookstein 2015, all three in Evol Biol). Accordingly, a deviation from
> isotropy does not itself inform about integration or modularity (in any
> 

RE: [MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread F. James Rohlf
I agree with Philipp, but I would like to add how I think about the 
justification for sliding semilandmarks. If one were smart enough to know 
exactly where the most meaningful locations are along some curve, then one 
should just place the points along the curve and computationally treat them 
as fixed landmarks. However, if their exact positions are to some extent 
arbitrary (usually the case) although still along a defined curve, then sliding 
makes sense to me as it minimizes the apparent differences among specimens (the 
sliding minimizes your measure of how much specimens differ from each other or, 
usually, from the mean shape).
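
To make the sliding step concrete, here is a minimal sketch in Python. It 
assumes the Procrustes-distance criterion (the minimum bending energy 
criterion mentioned elsewhere in this thread is not implemented here), 
approximates the curve at a semilandmark by the chord through its two 
neighbours, and uses made-up coordinates. The function names are hypothetical 
and do not come from tpsRelw or any other package.

```python
def tangent_from_neighbours(prev_pt, next_pt):
    """Approximate the curve's tangent at a semilandmark by the chord
    joining its two neighbouring points (an assumption of this sketch)."""
    return (next_pt[0] - prev_pt[0], next_pt[1] - prev_pt[1])


def slide_along_tangent(point, tangent, reference):
    """Move `point` along `tangent` to the position closest to `reference`.

    This minimises the squared distance to the reference point, i.e. it
    removes the part of the difference that lies along the curve.
    """
    tx, ty = tangent
    norm = (tx * tx + ty * ty) ** 0.5
    tx, ty = tx / norm, ty / norm            # unit tangent
    dx = reference[0] - point[0]
    dy = reference[1] - point[1]
    t = dx * tx + dy * ty                    # signed slide distance
    return (point[0] + t * tx, point[1] + t * ty)


# Hypothetical data: one semilandmark between two fixed landmarks.
specimen = [(0.0, 0.0), (1.0, 0.6), (2.0, 0.0)]
mean_shape = [(0.0, 0.0), (1.3, 0.5), (2.0, 0.0)]

tangent = tangent_from_neighbours(specimen[0], specimen[2])
slid = slide_along_tangent(specimen[1], tangent, mean_shape[1])

# After sliding, the remaining difference is perpendicular to the tangent.
residual = (mean_shape[1][0] - slid[0], mean_shape[1][1] - slid[1])
along_curve = residual[0] * tangent[0] + residual[1] * tangent[1]
print(slid, along_curve)
```

In practice sliding and superimposition are usually iterated over all 
specimens, and slid points are typically projected back onto the curve; the 
sketch shows a single slide of one point only.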

 

_ _ _ _ _ _ _ _ _

F. James Rohlf, Distinguished Prof. Emeritus



Depts. of Anthropology and of Ecology & Evolution

 

 

From: mitte...@univie.ac.at  
Sent: Tuesday, November 6, 2018 9:09 AM
To: MORPHMET 
Subject: [MORPHMET] Re: semilandmarks in biology

 


[MORPHMET] Re: semilandmarks in biology

2018-11-06 Thread mitte...@univie.ac.at
I agree only in part.

Whether or not semilandmarks "really are needed" may be hard to say 
beforehand. If the signal is known well enough before the study, even a 
single linear distance or distance ratio may suffice. In fact, most 
geometric morphometric studies are characterized by an oversampling of 
(anatomical) landmarks as an exploratory strategy: it allows for unexpected 
findings (and nice visualizations). 

Furthermore, there is a fundamental difference between sliding 
semilandmarks and other outline methods, including EFA. When establishing 
correspondence of semilandmarks across individuals, the minBE sliding 
algorithm takes the anatomical landmarks (and their stronger biological 
homology) into account, while standard EFA and related techniques cannot 
easily combine point homology with curve or surface homology. Clearly, when 
point homology exists, it should be parameterized accordingly. If smooth 
curves or surfaces exist, they should also be parameterized, whether or 
not this makes the analysis slightly more challenging.
 
Anyway, different landmarks often convey different biological signals and 
different homology criteria. For instance, Type I and Type II landmarks 
(sensu Bookstein 1991) differ fundamentally in their notion of homology. 
Whereas Type I landmarks are defined in terms of local anatomy or 
histology, a Type II landmark is a purely geometric construct, which may or 
may not coincide with notions of anatomical/developmental homology. ANY 
reasonable morphometric analysis must be interpreted in the light of the 
correspondence function employed, and the same holds true for 
semilandmarks. For this, of course, one needs to understand the basic 
properties of sliding landmarks, much as the basic properties of Procrustes 
alignment, etc. For instance, both the sliding algorithm and Procrustes 
alignment introduce correlations between shape coordinates (hence their 
reduced degrees of freedom). This is one of the reasons why I have warned 
for many years and in many publications about the biological interpretation 
of raw correlations (e.g., summarized in Mitteroecker et al. 2012 Evol 
Biol). Interpretations in terms of morphological integration or modularity 
are even more difficult because in most studies these concepts are not 
operationalized. They are either described by vague and biologically 
trivial narratives, or they are themselves defined as patterns of 
correlations, which is circular and makes most "hypotheses" untestable.

The same criticism applies to the naive interpretation of PCA scree plots 
and derived statistics. An isotropic (circular) distribution of shape 
coordinates corresponds to no biological model or hypothesis whatsoever 
(e.g., Huttegger & Mitteroecker 2011, Bookstein & Mitteroecker 2014, and 
Bookstein 2015, all three in Evol Biol). Accordingly, a deviation from 
isotropy does not itself inform about integration or modularity (in any 
reasonable biological sense).

The multivariate distribution of shape coordinates, including "dominant 
directions of variation," depends on many arbitrary factors, including the 
spacing, superimposition, and sliding of landmarks as well as on the number 
of landmarks relative to the number of cases. But all of this applies to 
both anatomical landmarks and sliding semilandmarks.

I don't understand how the fact that semilandmarks make some of these 
issues more obvious is an argument against their use.

Best,

Philipp







On Tuesday, 6 November 2018 13:28:55 UTC+1, alcardini wrote:
>
> As a biologist, for me, the question about whether or not to use 
> semilandmarks starts with whether I really need them and what they're 
> actually measuring.
>
> On this, among others, Klingenberg, O'Higgins and Oxnard have written some 
> very important easy-to-read papers that everyone doing morphometrics should 
> consider and carefully ponder. They can be found at: 
> https://preview.tinyurl.com/semilandmarks
>
> I've included there also an older criticism by O'Higgins on EFA and 
> related methods. Like semilandmarks, EFA and similar methods for the analysis 
> of outlines measure curves (or surfaces) where landmarks might be few or 
> missing: if semilandmarks are OK because where the points map is 
> irrelevant, as long as they capture homologous curves or surfaces, the same 
> applies for EFAs and related methods; however, the opposite is also true 
> and, if there are problems with 'homology' in EFA etc., those problems are 
> there also using semilandmarks as a trick to discretize curves and 
> surfaces. 
>
> Even with those problems, one could still have valid reasons to use 
> semilandmarks but it should be honestly acknowledged that they are the best 
> we can do (for now at least) in very difficult cases. Most of the studies I 
> know (certainly a minority from a now huge literature) seem to only provide 
> post-hoc justification of the putative importance of semilandmarks: there 
> were few 'good landmarks'; 

Re: [MORPHMET] Re: Are more semi landmarks better??

2018-11-06 Thread Mike Collyer
Philipp’s message below felt a little like a déjà vu moment.  I checked the 
Morphmet archives and sure enough, we had a similar thread back in late 
May/early June 2017.  Diego, you might want to check that thread, as a lot of 
what was discussed is relevant to your current questions.

Cheers!
Mike

> On Nov 6, 2018, at 5:33 AM, mitte...@univie.ac.at wrote:
> 
> I'd like to respond to your question because it comes up so often.
> 
> As noted by Carmelo in the other posting, a large number of variables 
> relative to the number of cases can lead to statistical problems. But often 
> it does not.
> 
> In all analyses that treat each variable separately - including the 
> computation of mean shapes and shape regressions - the number of variables 
> does NOT matter! Also in principal component analysis (PCA) and between-group 
> PCA there is NO restriction on the number of variables. However, the 
> distribution of landmarks across the organism can influence the results. 
> E.g., if one part - say the face - is covered only by a few anatomical 
> landmarks, and another part - e.g., the neurocranium - by many semilandmarks, 
> the latter one will dominate PCA results. But this holds true for all kinds 
> of landmarks and variables, not only for semilandmarks.
> 
> Analyses that involve the inversion of a covariance matrix - such as multiple 
> regression, CVA, relative eigenanalysis, reduced rank regression, and 
> parametric multivariate tests - require a clear excess of cases over 
> variables. In any truly multivariate setting (such as geometric 
> morphometrics), these analyses - if unavoidable - should ALWAYS be preceded 
> by some sort of variable reduction and/or factor analysis. Again, this is not 
> specific to semilandmarks.
> 
> Partial least squares (PLS) is somewhat in between these two groups. As shown 
> in Bookstein's 2016 paper, the singular values (maximal covariances) in PLS 
> can be strongly inflated if the number of variables is large compared to the 
> number of cases. The singular vectors, however, are more stable.
> 
> Essentially, the number of semilandmarks should be determined based on the 
> anatomical details to be captured. More semilandmarks are not "harmful," 
> perhaps just a waste of time.
> 
> Best,
> 
> Philipp Mitteroecker
> 
> 
> 
> 
>  
> 
> 
> On Monday, 5 November 2018 18:52:57 UTC+1, Diego Ardón wrote:
> Good day everybody, I actually have two questions here regarding 
> semi-landmarks:
> 
> So, I was advised to use semi-landmarks, I placed them with MakeFan8, saved 
> the files as images and then used TpsDig to place all landmarks, however I 
> didn't make any distinctions between landmarks and semi-landmarks. What 
> unsettles me is (1) that I've recently come across the term "sliding 
> semi-landmarks", which leads me to believe semi-landmarks should behave in a 
> particular way. 
> 
> The second thing that unsettles me is whether "more semi-landmarks" means a 
> better analysis. I can understand that most people wouldn't use 65 
> 

[MORPHMET] Re: Are more semi landmarks better??

2018-11-06 Thread mitte...@univie.ac.at
I'd like to respond to your question because it comes up so often.

As noted by Carmelo in the other posting, a large number of variables 
relative to the number of cases can lead to statistical problems. But often 
it does not.

In all analyses that treat each variable separately - including the 
computation of mean shapes and shape regressions - the number of variables 
does NOT matter! Also in principal component analysis (PCA) and 
between-group PCA there is NO restriction on the number of variables. 
However, the distribution of landmarks across the organism can influence 
the results. E.g., if one part - say the face - is covered only by a few 
anatomical landmarks, and another part - e.g., the neurocranium - by many 
semilandmarks, the latter one will dominate PCA results. But this holds 
true for all kinds of landmarks and variables, not only for semilandmarks.

Analyses that involve the inversion of a covariance matrix - such as 
multiple regression, CVA, relative eigenanalysis, reduced rank regression, 
and parametric multivariate tests - require a clear excess of cases over 
variables. In any truly multivariate setting (such as geometric 
morphometrics), these analyses - if unavoidable - should ALWAYS be preceded 
by some sort of variable reduction and/or factor analysis. Again, this is 
not specific to semilandmarks.
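
The point about matrix inversion can be checked with a small simulation. The 
sketch below is only an illustration under stated assumptions: the data are 
random numbers rather than shape coordinates, the sample sizes are arbitrary, 
and the helper functions are hypothetical. It shows that a covariance matrix 
estimated from n cases has rank at most n - 1, so with fewer cases than 
variables it is singular and cannot be inverted.

```python
import random


def covariance_matrix(data):
    """Sample covariance matrix of a list of cases (rows) by variables."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def matrix_rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue                      # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def random_cases(n_cases, n_variables, rng):
    """Independent standard-normal values (stand-in for real data)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_variables)]
            for _ in range(n_cases)]


rng = random.Random(1)
n_variables = 12

# Fewer cases than variables: rank is n - 1 = 4, so no inverse exists.
rank_few_cases = matrix_rank(covariance_matrix(random_cases(5, n_variables, rng)))

# Clear excess of cases over variables: full rank, inversion is possible.
rank_many_cases = matrix_rank(covariance_matrix(random_cases(50, n_variables, rng)))

print(rank_few_cases, rank_many_cases)
```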

Partial least squares (PLS) is somewhat in between these two groups. As 
shown in Bookstein's 2016 paper, the singular values (maximal covariances) 
in PLS can be strongly inflated if the number of variables is large 
compared to the number of cases. The singular vectors, however, are more 
stable.
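
A rough simulation can illustrate this inflation. It is a sketch under 
assumptions, not a reproduction of Bookstein's 2016 formula: two blocks of 
independent random variables are generated, so the true covariance between 
blocks is zero, and the first singular value of the sample cross-covariance 
matrix is estimated by power iteration. Block sizes, sample size and function 
names are all arbitrary choices made for the illustration.

```python
import random


def center(block):
    """Subtract the column means from a cases-by-variables block."""
    n, p = len(block), len(block[0])
    means = [sum(row[j] for row in block) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in block]


def first_singular_value(x_block, y_block, iterations=200):
    """Largest singular value of the cross-covariance matrix of two blocks,
    estimated by power iteration on C^T C."""
    n = len(x_block)
    xc, yc = center(x_block), center(y_block)
    p, q = len(xc[0]), len(yc[0])
    cross = [[sum(xc[k][i] * yc[k][j] for k in range(n)) / (n - 1)
              for j in range(q)] for i in range(p)]
    v = [1.0] * q
    for _ in range(iterations):
        u = [sum(cross[i][j] * v[j] for j in range(q)) for i in range(p)]
        w = [sum(cross[i][j] * u[i] for i in range(p)) for j in range(q)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]          # unit-length right singular vector
    u = [sum(cross[i][j] * v[j] for j in range(q)) for i in range(p)]
    return sum(x * x for x in u) ** 0.5


def random_block(n_cases, n_variables, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(n_variables)]
            for _ in range(n_cases)]


rng = random.Random(7)
n_cases = 20

# Two variables per block versus thirty per block, same number of cases.
sv_few = first_singular_value(random_block(n_cases, 2, rng),
                              random_block(n_cases, 2, rng))
sv_many = first_singular_value(random_block(n_cases, 30, rng),
                               random_block(n_cases, 30, rng))

print(sv_few, sv_many)
```

Although no real association exists in either case, the first singular value 
comes out larger when each block has many variables relative to the number of 
cases.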

Essentially, the number of semilandmarks should be determined based on the 
anatomical details to be captured. More semilandmarks are not "harmful," 
perhaps just a waste of time.

Best,

Philipp Mitteroecker




 


On Monday, 5 November 2018 18:52:57 UTC+1, Diego Ardón wrote:
>
> Good day everybody, I actually have two questions here regarding 
> semi-landmarks:
>
> So, I was advised to use semi-landmarks, I placed them with MakeFan8, 
> saved the files as images and then used TpsDig to place all landmarks, 
> however I didn't make any distinctions between landmarks and 
> semi-landmarks. What unsettles me is (1) that I've recently come across 
> the term "sliding semi-landmarks", which leads me to believe semi-landmarks 
> should behave in a particular way. 
>
> The second thing that unsettles me is whether "more semi-landmarks" means 
> a better analysis. I can understand that most people wouldn't use 65 
> landmarks+semilandmarks because it's a painstaking job to digitize them, 
> however, in my recent reads I've come across concepts like a "Variables to 
> specimen ratio", which one paper suggested specimens should be 5 times the 
> number of variables. I do have a data set of nearly 400 specimens, but it 
> does come short if indeed I should have 65*2*5 specimens!
>
> Please, I'll appreciate some feedback :)
>

-- 
MORPHMET may be accessed via its webpage at 

Re: [MORPHMET] A question regarding "target shape"

2018-11-06 Thread Carmelo Fruciano




On 05/11/2018 18:50, Diego Ardón wrote:

Captura de pantalla 2018-11-05 a la(s) 11.40.26.png


Thank you Mr. Fruciano. I had already made the DFA, but wasn't aware the 
graphical output represented both groups (it certainly makes sense). I 
have a couple of other questions regarding semi-landmarks. I probably 
should start a new topic, but I'll first try out here:



So, I was advised to use semi-landmarks, I placed them with MakeFan8, 
saved the files as images and then used TpsDig to place all landmarks, 
however I didn't make any distinctions between landmarks and 
semi-landmarks. What unsettles me is (1) that I've recently come across 
the term "sliding semi-landmarks", which leads me to believe 
semi-landmarks should behave in a particular way.


Well, it's a long topic, but the general idea is that, to account for 
the uncertainty in placement of a semilandmark along a curve, this is 
slid along the curve itself (or, more frequently, its approximation) so 
that ideally only variation perpendicular to the curve (reflecting the 
curvature) is retained.
In current practice, semilandmarks are slid. Various software packages can do 
this, the most popular for 2D data certainly being tpsRelW by F.J. Rohlf.

A good, recent and accessible treatment of this topic is:

Gunz & Mitteroecker 2013. Semilandmarks: a method for quantifying curves 
and surfaces. Hystrix



The second thing that 
unsettles me is whether "more semi-landmarks" means a better analysis.


Not necessarily.

I 
can understand that most people wouldn't use 65 landmarks+semilandmarks 
because it's a painstaking job to digitize them, however, in my recent 
reads I've come across concepts like a "Variables to specimen ratio", 
which one paper suggested specimens should be 5 times the number of 
variables. I do have a data set of nearly 400 specimens, but it does 
come short if indeed I should have 65*2*5 specimens!


There are two issues: 1. whether statistical procedures are defined, 2. 
whether one has enough power and/or how large the error in estimates is.


The first issue is easy to deal with: certain statistical procedures 
(for instance, the ones involving matrix inversion) are not defined if 
there are many variables and relatively few cases. These procedures 
simply "don't work". However, there are other alternative procedures 
which do work (e.g., the ones based on distances) and/or workarounds 
(e.g., use of generalized inverses).


The second issue is much more complex and I doubt one can give a 
straightforward answer. In general, the more observations (specimens) 
the better (when one can get them, that is). But the idea of a certain 
number of observations relative to the number of variables is, at best, 
a rule of thumb.
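
For reference, the arithmetic behind the "65*2*5" figure in the question can 
be written out as below. The numbers come from the question itself (65 points 
in 2D, a suggested ratio of 5, "nearly 400" specimens, taken here as 400); the 
variable names are only illustrative. The one addition is the standard fact 
that 2D Procrustes superimposition removes four degrees of freedom 
(two for translation, one for rotation, one for scale).

```python
points = 65                 # landmarks + semilandmarks in the question
dimensions = 2              # 2D data
suggested_ratio = 5         # specimens per variable, from the cited rule of thumb
available_specimens = 400   # "nearly 400" in the question, rounded up here

variables = points * dimensions                      # 130 coordinates
required_specimens = variables * suggested_ratio     # 650 specimens

# Shape space dimension after 2D Procrustes superimposition.
# Sliding removes further degrees of freedom per semilandmark, but the
# number of semilandmarks is not given, so that is not computed here.
shape_dimensions = variables - 4                     # 126
required_for_shape = shape_dimensions * suggested_ratio  # 630

actual_ratio = available_specimens / variables

print(variables, required_specimens, shape_dimensions, required_for_shape)
print(round(actual_ratio, 2))
```

Either way the data set falls short of the quoted ratio, which is why the 
status of that ratio as a rule of thumb matters.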


Clearly, having too many variables can create problems and artifacts. An 
interesting recent example of this can be found in


Bookstein 2016 - A newly noticed formula enforces fundamental limits of 
geometric morphometric analyses. Evolutionary Biology


In your particular case, if I were you I would ask myself whether all 
those points/semilandmarks are that necessary to capture biologically 
relevant variation. That is a question that only you can answer, based 
on your knowledge of the biological problem at hand.


Statistical power and the reliability of estimates are another issue, which 
is in part dataset-dependent (as well as dependent on which statistical 
procedures you intend to use). An interesting paper dealing with this is


Cardini 2007. Sample size and sampling error in geometric morphometric 
studies of size and shape. Zoomorphology


In general, as said above, it's very hard to give straightforward 
answers to your question.

I hope this still helps, though.
Carmelo


==
Carmelo Fruciano
Institute of Biology
Ecole Normale Superieure - Paris
CNRS
http://www.fruciano.it/research/


On Monday, November 5, 2018, 2:12:20 (UTC-6), Carmelo Fruciano 
wrote:




On 03/11/2018 22:28, Diego Ardón wrote:
 > Dear Mr. Soda,
 >
 > Thank you for replying. Your statement "setting one group’s mean shape
 > to be the starting shape and the other group’s to the target; this will
 > lead to the most direct comparison" pretty much describes what I have
 > in mind to do. Which software could I use to do this? I believe
 > MorphoJ will not do it.

Dear Diego,
MorphoJ will actually do it. The easiest is to use what is under the
menu "Discriminant analysis". MorphoJ's user guide has a brief but very
clear description of the graphical output.
I hope this helps.
Best,
Carmelo


-- 



==
Carmelo Fruciano
Institute of Biology
Ecole Normale Superieure - Paris
CNRS
http://www.fruciano.it/research/ 


 > On Wednesday, October 31, 2018, 13:51:07 (UTC-6), K. James Soda
 > wrote:
 >
 >     Dear Mr. Ardón,
 >
 >