-------- Original Message --------
Subject: Re: Splitting up a data set versus combining it for CVA
Date: Wed, 10 Aug 2011 07:05:19 -0400
From: Carmelo Fruciano <[email protected]>
To: [email protected]

morphmet <[email protected]> wrote:
-------- Original Message --------
Subject: Splitting up a data set versus combining it for CVA
Date: Mon, 8 Aug 2011 08:00:31 -0400
From: Alexandra Wegmann <[email protected]>
Reply-To: Alexandra Wegmann <[email protected]>
To: [email protected] <[email protected]>

Dear Morphometricians,

I am trying to publish the results of my master's thesis. In the thesis I ran a separate canonical variates analysis for each question, including only the Procrustes-aligned data of the specimens of interest (i.e. one data set contained just the adults, another only the juveniles, another just the laboratory-reared specimens, etc.).

A paper should, however, be as short as possible. I am therefore considering combining all the data into one data set, running the analysis once, and then discussing only the groups in question in each chapter (this way I only have to explain the morphological changes associated with each axis once in the paper).
Dear Alexandra,

I wasn't able to understand the exact meaning of your question. You first write about separate CVAs "for each question", then you think about pooling all observations. My point is: is there a single "question" or several?

A typical example would be if you were interested in the difference between two (or more) groups of individuals (say, sampled at two geographical locations), but within those groups there is variation due to other causes, such as growth-related shape changes. In this case a typical approach (limitations may apply, of course) would be to run a MANCOVA using size as a covariate (as a proxy for age) to control, in this example, for growth-related variation, and then to test for differences between groups once that variation has been accounted for.

General linear models can incorporate both categorical and continuous predictors, so you could use (within reason) lab-reared/non-lab-reared, size and so on as predictors, each controlling for the others (and probably also test the interaction terms). Given a specific problem there could be limitations to this approach, but the general idea is widely used.

Now, if the "questions" you ask about your groups are fundamentally distinct and depend on the group (for instance, you test for variation between sampling sites in wild specimens and for the effect of temperature in lab-reared specimens), I wonder why you would want to pool all observations.

Well, I hope that this is of at least some help...

Carmelo

--
Carmelo Fruciano
Dipartimento di Biologia
University of Catania
Tel. +39 095 7306023
Cell. +39 349 5822831
e-mail [email protected]
http://www.fruciano.it/research/
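
As an illustration of the MANCOVA idea described in the reply above, here is a minimal sketch in Python with NumPy. It is not taken from the original message: the data are simulated, and the names `residual_sscp`, `wilks_lambda`, `group`, `size` and `Y` are hypothetical. It compares a reduced model (size only) with a full model (size plus group) through Wilks' lambda, the ratio of the determinants of the residual sums-of-squares-and-cross-products matrices. Converting lambda to an F statistic and a p-value is left out; in practice a dedicated statistics package would normally handle that step.

```python
import numpy as np


def residual_sscp(Y, X):
    """Residual sums-of-squares-and-cross-products matrix of Y regressed on X."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # multivariate least squares
    R = Y - X @ B                              # residuals, one row per specimen
    return R.T @ R


def wilks_lambda(Y, X_full, X_reduced):
    """Wilks' lambda for the terms present in X_full but absent from X_reduced."""
    return np.linalg.det(residual_sscp(Y, X_full)) / np.linalg.det(
        residual_sscp(Y, X_reduced)
    )


# Simulated data (assumption): two groups of 40 specimens, a size covariate,
# and two shape variables that depend on both size and group membership.
rng = np.random.default_rng(0)
n = 40
group = np.repeat([0.0, 1.0], n)          # e.g. two sampling locations
size = rng.normal(5.0, 1.0, 2 * n)        # e.g. centroid size as a proxy for age
noise = rng.normal(0.0, 0.2, (2 * n, 2))
Y = np.column_stack([0.5 * size + 1.0 * group,
                     -0.3 * size + 1.0 * group]) + noise

ones = np.ones(2 * n)
X_reduced = np.column_stack([ones, size])                  # size only
X_full = np.column_stack([ones, size, group])              # size + group
X_inter = np.column_stack([ones, size, group, size * group])  # + interaction

# Group effect after controlling for size (values near 0 mean a strong effect).
lam = wilks_lambda(Y, X_full, X_reduced)
# Size-by-group interaction (values near 1 mean similar growth trajectories).
lam_inter = wilks_lambda(Y, X_inter, X_full)

print(f"Wilks' lambda, group | size: {lam:.3f}")
print(f"Wilks' lambda, size x group interaction: {lam_inter:.3f}")
```

Because the simulated groups differ strongly and share the same dependence on size, the first lambda should come out small and the second close to 1. With real landmark data the shape variables would usually be reduced first (for example to a few principal component scores), since the number of Procrustes coordinates can approach or exceed the number of specimens.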
