Re: brief comment on non-significance Re: [MORPHMET] procD.allometry with group inclusion

Mike Collyer Mon, 12 Dec 2016 06:34:41 -0800

Andrea,

My opinion on this is that the researcher who has collected the data must 
retain at all times a biological wisdom that supersedes a suggested course of 
action based on results from a statistical test.  If the purpose of a study is 
to assess the allometric pattern of shape variation within populations, then 
maybe the results of a homogeneity of slopes test can be an unnecessary burden. 
 If a researcher wants to compare the mean shapes of different groups but is 
concerned that allometric variation might differ among groups, then a 
homogeneity of slopes test could be an important first step, but I agree that a 
non-significant result should not spur the researcher to immediately conclude a 
common allometry or no allometry is appropriate.  Sample size, variation in 
size among groups, and appropriate distributions of specimen size within groups 
might all be things to think about.


The point you make about a potential type II error is a real concern.  The 
opposite problem is also a real concern.  One might have very large sample 
sizes and sufficient statistical power to suggest that allometric slopes are 
heterogeneous.  However, the coefficient of determination and/or effect size 
for the size:group interaction might be quite small.  Just because there is a low 
probability of finding as large an effect in thousands of random 
permutations, is one ready to accept that different groups have evolved unique 
allometric trajectories?  It is easy to forget that the choice of “significance 
level” - the a priori acceptable rate of type I error - is arbitrary.  Making 
strong inferential decisions based on a binary decision for an arbitrary 
criterion is probably not wise.  I would argue that instead of focusing on a 
P-value, one could just as arbitrarily, but perhaps more justifiably, choose a 
coefficient of determination of R^2 = 0.10 or an effect size of 2 SD as a 
criterion for whether to retain or omit the interaction coefficients that allow 
for heterogeneous slopes.
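
As a rough illustration of that alternative criterion, here is a minimal sketch in Python with NumPy (not geomorph, and not the code behind procD.allometry).  It computes the size:group sum of squares as the difference in residual SS between shape ~ size + group and shape ~ size + group + size:group, its R^2, a P-value from permuting reduced-model residuals, and an effect size expressed as the observed SS in standard deviations of its permutation distribution.  The function names, the treatment coding of groups, and the reading of "2 SD" as that standardized distance are all assumptions made for the example.

```python
import numpy as np

def design(size, group, interaction):
    """Design matrix for shape ~ size + group (+ size:group), treatment-coded."""
    levels = np.unique(group)
    dummies = (group[:, None] == levels[None, 1:]).astype(float)
    cols = [np.ones(len(size)), size, *dummies.T]
    if interaction:
        cols += list((dummies * size[:, None]).T)
    return np.column_stack(cols)

def fitted(X, Y):
    """Least-squares fitted values of the shape matrix Y on design X."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return X @ B

def rss(X, Y):
    """Residual sum of squares summed over all shape variables (trace of SSCP)."""
    return float(np.sum((Y - fitted(X, Y)) ** 2))

def hos_summary(Y, size, group, n_perm=999, seed=0):
    """SS, R^2, permutation P-value and standardized effect for size:group."""
    rng = np.random.default_rng(seed)
    X_red = design(size, group, interaction=False)
    X_full = design(size, group, interaction=True)

    def ss_interaction(Yv):
        return rss(X_red, Yv) - rss(X_full, Yv)

    ss_obs = ss_interaction(Y)
    ss_total = float(np.sum((Y - Y.mean(axis=0)) ** 2))

    # Randomize residual vectors (rows) of the reduced model, add them back
    # to the reduced-model fitted values, and recompute the SS each time.
    fit_red = fitted(X_red, Y)
    res_red = Y - fit_red
    ss_rand = np.empty(n_perm + 1)
    ss_rand[0] = ss_obs  # the observed value counts as one permutation
    for i in range(1, n_perm + 1):
        Y_star = fit_red + res_red[rng.permutation(len(Y))]
        ss_rand[i] = ss_interaction(Y_star)

    return {
        "SS": ss_obs,
        "Rsq": ss_obs / ss_total,
        "P": float(np.mean(ss_rand >= ss_obs)),
        "Z": float((ss_obs - ss_rand.mean()) / ss_rand.std(ddof=1)),
    }

def retain_interaction(summary, rsq_min=0.10, z_min=2.0):
    """Keep size:group if either (arbitrary) effect criterion is met."""
    return bool(summary["Rsq"] >= rsq_min or summary["Z"] >= z_min)
```

Calling retain_interaction(hos_summary(Y, size, group)) applies either threshold.  The cutoffs remain as arbitrary as a significance level, but they are tied directly to how much shape variation the interaction explains.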

*** Warning: pedantic discussion on model selection starts here.  Skip if 
unappealing.

One could also turn to model selection approaches.  However, I think 
multivariate generalization for indices like AIC is an area lacking needed 
theoretical research for high-dimensional shape data.  There are two reasons 
for this.  First, the oft-defined AIC is -2(model log-likelihood) + 2K, where K 
is the number of coefficients in a linear model (rank of the model design 
matrix) + 1, where the 1 is the dimension of the value for the variance of the 
error.  This is a simplification for univariate data.  The second half of the 
equation is actually 2[pk + 0.5p(p+1)], where p is the number of shape variables 
and k is the rank of the design matrix.  (One might define p as the rank of the 
shape variable matrix - the number of actual dimensions in the tangent space, 
also equal to the number of principal components with positive eigenvalues 
from a PCA - if using high-dimensional data or small samples.)  Notice 
that substituting 1 for p in this equation gets one back to the 2K, as defined 
first.  The pk part of the equation represents the dimensions of linear model 
coefficients; the 0.5p(p+1) part represents the dimensions of the error 
covariance matrix.  The reason this is important is that one might have picked 
up along the way that a delta AIC of 1-2 means two models are comparable (as if 
with equal likelihoods, they differ by around 1 parameter or less).  With highly 
multivariate data, this rule of thumb would have to be scaled up to a delta AIC 
of 1*p to 2*p, which makes it hard to have a good general sense of when models 
are comparable, unless one takes into consideration how many shape variables 
are in use.
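
The penalty arithmetic is easy to check with a few lines of Python.  This is only a sketch; the function name is made up for illustration, and p and k are as defined above.

```python
def aic_penalty(p, k):
    """Parameter penalty 2[pk + 0.5p(p+1)] for p shape variables, design rank k."""
    # p*k linear-model coefficients plus p(p+1)/2 unique elements of the
    # error covariance matrix, all multiplied by 2.
    return 2 * p * k + p * (p + 1)

# Univariate case: p = 1 gives 2(k + 1), i.e., 2K with K = k + 1.
univariate = aic_penalty(1, 3)

# Adding one row of coefficients (k -> k + 1) raises the penalty by 2p,
# so the "one parameter" yardstick grows with the number of shape variables.
step_for_50_variables = aic_penalty(50, 4) - aic_penalty(50, 3)
```

Here univariate equals 8 (2K with K = 4) and step_for_50_variables equals 100 (2p with p = 50).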

Second, the log-likelihood involves calculating the determinant of the error 
covariance matrix, which is problematic for singular matrices, as might be 
found with high-dimensional shape data.  Recently, colleagues and I have used 
plots of the log of the trace of error covariance matrices versus the log of 
parameter penalties - the 2[pk + 0.5p(p+1)] part - as a way of scanning 
candidate models for the one or two that have lower error relative to the 
number of parameters in the model.  Such an approach allows one to have no 
allometric slope, a common allometric slope, and unique allometric slopes, in 
combination with other important factors, and consider many models at once.  
But again, there is a certain level of arbitrariness to this.
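
A minimal sketch of how the coordinates for such a plot might be computed, in Python with NumPy, follows.  It is not the code we used.  The three candidate models, the function names, the use of the residual SSCP divided by (n - k) as the error covariance matrix, and taking p as the number of columns of the shape matrix (rather than its rank) are assumptions made for the example.

```python
import numpy as np

def candidate_designs(size, group):
    """Design matrices for three candidate models, treatment-coded groups."""
    levels = np.unique(group)
    dummies = (group[:, None] == levels[None, 1:]).astype(float)
    one = np.ones((len(size), 1))
    s = size[:, None]
    return {
        "no allometry": np.hstack([one, dummies]),                   # shape ~ group
        "common slope": np.hstack([one, s, dummies]),                # + size
        "unique slopes": np.hstack([one, s, dummies, dummies * s]),  # + size:group
    }

def scan_models(Y, size, group):
    """Return (model, log trace of error covariance, log penalty) per model."""
    n, p = Y.shape
    rows = []
    for name, X in candidate_designs(size, group).items():
        k = int(np.linalg.matrix_rank(X))
        B, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ B
        # The trace needs only the diagonal of the SSCP, so no determinant
        # (and no trouble with singular matrices) is involved.
        trace_cov = float(np.sum(resid ** 2)) / (n - k)
        penalty = 2 * (p * k + 0.5 * p * (p + 1))
        rows.append((name, float(np.log(trace_cov)), float(np.log(penalty))))
    return rows
```

Plotting the second element of each row against the third gives the kind of scan described above.  Models that sit low relative to their penalty are the candidates worth a closer look.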

*** End pedantic discussion

There are other issues that can be quite real with real data.  For example, if 
one wishes to consider whether there are shape differences among groups but first 
wishes to address whether there is meaningful allometric shape variation, and 
whether there might be different allometries among groups, a homogeneity of 
slopes test might be done.  But what if it is revealed that one group has all 
small specimens and one group has all large specimens?  The researcher knows 
better than anyone else whether this is sampling error or a biological 
phenomenon.  How to proceed should not rest solely on an outcome from a 
statistical test.  For example, if the specimens are adult organisms and 
represent large individuals within populations, one might want to discuss shape 
differences without adjusting for allometry, as well as discuss size 
differences.  A discussion of allometries in this case might obscure what is 
really most important, that maybe two populations evolved size and shape 
differences because of some ecologically meaningful reason, for example.  

So I agree with you, and more.  “No significance” or “significance” is only 
part of the evaluation.  Effect sizes and assessment of sampling errors, 
biases, or limitations should also be considered.  And no matter what, careful 
communication that reveals the researcher’s logic needs to be made in published 
articles.

Just my opinion,
Mike 

> On Dec 12, 2016, at 2:40 AM, andrea cardini <alcard...@gmail.com> wrote:
> 
> Dear All,
> 
> if I can, I'd add a brief comment on the interpretation of non-significant 
> results. I'd appreciate this to be checked by those with a proper 
> understanding and background on stats (which I haven't!).
> 
> I use Mike's sentence on non-significant slopes as an example but the issue 
> is a general one, although I find it particularly tricky in the context of 
> comparing trajectories (allometries or other) across groups. Mike wisely said 
> "approximately" ("If not significant, then the slope vectors are APPROXIMATELY 
> parallel"). With permutations, one might be able to perform tests even when 
> sample sizes are small (and maybe, which is even more problematic, 
> heterogeneous across groups): then, non-significance could simply mean that 
> samples are not large enough to make strong statements (rejection of the null 
> hypothesis) with confidence (i.e., statistical power is low). Especially with 
> short trajectories (allometries or other), one might find non-significant 
> differences between slopes despite very large angles between the vectors, a 
> case where it is probably hard to conclude that allometries really are parallel. 
> Small samples are the curse of many studies in taxonomy and evolution. 
> We've done a couple of exploratory (not very rigorous!) empirical analyses of 
> the effect of reducing sample sizes on means, variances, vector angles etc. 
> in geometric morphometrics (Cardini & Elton, 2007, Zoomorphol.; Cardini et 
> al., 2015, Zoomorphol.) and some, probably most, of these estimates literally 
> blow up when N goes down. That happened even when differences were relatively 
> large (species separated by several millions of years of independent evolution 
> or samples including domestic breeds hugely different from their wild 
> counterparts).
> 
> Unless one has done power analyses and/or has very large samples, I'd be 
> careful with the interpretations. There's plenty on this in the difficult 
> (for me) statistical literature. Surely one can do sophisticated power 
> analyses in R and, although probably and unfortunately not used by many, one 
> of the programs of the TPS series (TPSPower) was written by Jim exactly for 
> this aim (possibly not for power analyses in the case of MANCOVAs/vector 
> angles but certainly in the simpler case of comparisons of means).
> 
> Cheers
> 
> 
> Andrea
> 
> On 11/12/16 19:17, Mike Collyer wrote:
>> Dear Tsung,
>> 
>> The geomorph function, advanced.procD.lm, allows one to extract group slopes 
>> and model coefficients.  In fact, procD.allometry is a specialized function 
>> that uses advanced.procD.lm to perform the HOS test and then uses procD.lm 
>> to produce an ANOVA table, depending on the results of the HOS test.  It 
>> also uses the coefficients and fitted values from procD.lm to generate the 
>> various types of regression scores.  In essence, procD.allometry is a 
>> function that carries out several analyses with geomorph base functions, 
>> procD.lm and advanced.procD.lm, in a specified way.  By comparison, the 
>> output is more limited, but one can use the base functions to get much more 
>> output.
>> 
>> In advanced.procD.lm, if one specifies groups and a slope, one of the 
>> outputs is a matrix of slope vectors.  Also, one can perform pairwise tests 
>> to compare either the correlation or angle between slope vectors.
>> 
>> Regarding the operation of the HOS test, it is a permutational test that 
>> does the following: calculate the sum of squared residuals for a “full” 
>> model, shape ~ size + group + size:group and the same for a “reduced” model, 
>> shape ~ size + group.  (The sum of squared residuals is the trace of the 
>> error SSCP matrix, which is the same as the sum of the summed squared 
>> residuals for every shape variable.)  The difference between these two 
>> values is the sum of squares for the size:group effect.  If significantly 
>> large (i.e., is found with low probability in many random permutations), one 
>> can conclude that the coefficients for this effect are collectively large 
>> enough to justify retaining this effect, as the slope vectors are 
>> (at least in part) not parallel.  If not significant, then the slope vectors 
>> are approximately parallel, and the effect can be removed from the model.  A 
>> randomized residual permutation procedure is used, which randomizes the 
>> residual vectors of the reduced model in each random permutation to obtain 
>> random pseudo-values, repeating the sum of squares calculations each time.
>> 
>> Regarding your final question, yes, you are correct.  In a case like this, 
>> one might conclude that logCS is not a significant source of shape 
>> variation, and proceed with other analyses that do not include it as a 
>> covariate.  In either case - whether it is retained as a covariate or 
>> excluded - advanced.procD.lm will allow one to perform pairwise comparison 
>> tests among groups.
>> 
>> Cheers!
>> Mike
>> 
>>> On Dec 11, 2016, at 10:56 AM, Tsung Fei Khang <tfkh...@um.edu.my 
>>> <mailto:tfkh...@um.edu.my>> wrote:
>>> 
>>> Dear Mike,
>>> 
>>> Many thanks for the reply!
>>> 
>>> When the procD.allometry function performs HOS test with multiple group 
>>> labels given, does it compute the regression vectors for each group, and 
>>> then tests whether the coefficients of these vectors were equal, using some 
>>> multivariate statistical test? If so, is there an option that outputs the 
>>> regression vectors? Given the high frequency of the latter being discussed 
>>> in the primary GM literature, it seems important to be able to extract this 
>>> result from the function.
>>> 
>>> Finally, on the interpretation side - If group variation is significant, 
>>> but not logCS, then under the model shape~size+group, does this imply that 
>>> shape variation is mainly explained by variation in species, and allometry 
>>> is absent?
>>> 
>>> Regards,
>>> 
>>> T.F.
>>> 
>>> On Thursday, December 8, 2016 at 6:08:17 PM UTC+8, Mike Collyer wrote:
>>> Dear Tsung,
>>> 
>>> The procD.allometry function performs two basic processes when groups are 
>>> provided.  First, it does a homogeneity of slopes (HOS) test.  This test 
>>> ascertains whether two or more groups have parallel or unique slopes (the 
>>> latter meaning at least one group’s slope differs from the others).  
>>> The HOS test constructs two linear models: shape ~ size + group and shape ~ 
>>> size + group + size:group, and performs an analysis of variance to 
>>> determine if the size:group interaction significantly reduces the residual 
>>> error produced.  (Note: log(size) is a possible and default choice in this 
>>> analysis.)
>>> 
>>> After this test, procD.allometry then provides an analysis of variance on 
>>> each term in the resulting model from the HOS test.
>>> 
>>> Regarding your question, if the HOS test reveals there is significant 
>>> heterogeneity in slopes, the coefficients returned allow one to find the 
>>> unique linear equations, by group, which would be found from separate runs 
>>> of procD.allometry, one group at a time.  If the HOS test reveals that 
>>> there is not significant heterogeneity in slopes, the coefficients 
>>> constrain the slopes for different groups to be the same (parallel).  
>>> 
>>> Finally, and I think more to your point, the projected regression scores 
>>> are found by using, for a (in the Xa calculation you note), the coefficients 
>>> that represent a common or individual slope from the linear model produced. 
>>>  The matrix of coefficients, B, is arranged as first row = intercept, 
>>> second row = common slope, next rows (if applicable) are coefficients for 
>>> the group factor (essentially change the intercept, by group), and finally, 
>>> the last rows are the coefficients for the size:group interaction (if 
>>> applicable), which change the common slope to match each group’s unique 
>>> slope.  Irrespective of the complexity of this B matrix, a is found as the 
>>> second row.  If you run procD.allometry group by group, it is the same as 
>>> (1) asserting that group slopes are unique and (2) changing a to match not 
>>> the common slope, but the summation of the common slope and the 
>>> group-specific slope adjustment.  One could do that, but would lose the 
>>> ability to compare the groups in the same plot, as each group would be 
>>> projected on a different axis.  
>>> 
>>> Hope that helps.
>>> 
>>> Mike
>>> 
>>> 
>>>> On Dec 8, 2016, at 3:37 AM, Tsung Fei Khang <tfk...@um.edu.my <>> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> I would like to use procD.allometry to study allometry in two species. 
>>>> 
>>>> I understand that the function returns the regression score for each 
>>>> specimen as Reg.proj, and that the calculation is obtained as:
>>>> s = Xa, where X is the n x p matrix of Procrustes shape variables, and a is 
>>>> the p x 1 vector of regression coefficients normalized to unit length. I am able to 
>>>> verify this computation from first principles when all samples are 
>>>> presumed to come from the same species. 
>>>> 
>>>> However, what happens when we are interested in more than 1 species (say 
>>>> 2)? I could run procD.allometry by including the species labels via 
>>>> f2=~gps, where gps gives the species labels. Is there just 1 regression 
>>>> vector (which feels weird, since this should be species-specific), or 2? 
>>>> If so, how can I recover both vectors? What is the difference between 
>>>> including f2=~gps using all data and making two separate runs of 
>>>> procD.allometry, one for samples from species 1, and another for samples 
>>>> from species 2?
>>>> 
>>>> Thanks for any help.
>>>> 
>>>> Rgds,
>>>> 
>>>> TF
>>>> 
>>>> 
>>>> " DISCLAIMER: This e-mail and any files transmitted with it ("Message") is 
>>>> intended only for the use of the recipient(s) named above and may contain 
>>>> confidential information. You are hereby notified that the taking of any 
>>>> action in reliance upon, or any review, retransmission, dissemination, 
>>>> distribution, printing or copying of this Message or any part thereof by 
>>>> anyone other than the intended recipient(s) is strictly prohibited. If you 
>>>> have received this Message in error, you should delete this Message 
>>>> immediately and advise the sender by return e-mail. Opinions, conclusions 
>>>> and other information in this Message that do not relate to the official 
>>>> business of University of Malaya shall be understood as neither given nor 
>>>> endorsed by any of the forementioned. "
>>>> 
>>>> -- 
>>>> MORPHMET may be accessed via its webpage at http://www.morphometrics.org 
>>>> <http://www.morphometrics.org/>
>>>> --- 
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "MORPHMET" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>> email to morphmet+u...@ <>morphometrics.org <http://morphometrics.org/>.
>>> 
>> 
> 
> -- 
> 
> Dr. Andrea Cardini
> Researcher, Dipartimento di Scienze Chimiche e Geologiche, Università di 
> Modena e Reggio Emilia, Via Campi, 103 - 41125 Modena - Italy
> tel. 0039 059 2058472
> 
> Adjunct Associate Professor, School of Anatomy, Physiology and Human Biology, 
> The University of Western Australia, 35 Stirling Highway, Crawley WA 6009, 
> Australia
> 
> E-mail address: alcard...@gmail.com <mailto:alcard...@gmail.com>, 
> andrea.card...@unimore.it <mailto:andrea.card...@unimore.it>
> WEBPAGE: https://sites.google.com/site/alcardini/home/main 
> <https://sites.google.com/site/alcardini/home/main>
> 
> FREE Yellow BOOK on Geometric Morphometrics: 
> http://www.italian-journal-of-mammalogy.it/public/journals/3/issue_241_complete_100.pdf
>  
> <http://www.italian-journal-of-mammalogy.it/public/journals/3/issue_241_complete_100.pdf>
> 
> ESTIMATE YOUR GLOBAL FOOTPRINT: 
> http://www.footprintnetwork.org/en/index.php/GFN/page/calculators/ 
> <http://www.footprintnetwork.org/en/index.php/GFN/page/calculators/>
> 

-- 
MORPHMET may be accessed via its webpage at http://www.morphometrics.org
--- 
You received this message because you are subscribed to the Google Groups 
"MORPHMET" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to morphmet+unsubscr...@morphometrics.org.
