Dear Carmelo,

Thank you very much again for such a detailed discussion and suggestions!
Now I have a more focused idea of what I should consider and try first.
I will perform some preliminary tests and may contact you privately again!

Best,
Han

On Wednesday, 2 November 2022 at 18:25:43 UTC, <Carmelo Fruciano> wrote:

>
>
> On 02/11/22 13:32, Han Xiao wrote:
> > 
> > Dear Carmelo,
> > 
> > Thank you so much for your very detailed comments and suggestions! 
> > Again, I am very glad that I participated in the Physalia course 
> > with you, and I did not imagine you would still remember me from it!
>
> I do remember most, if not all, of the participants in my courses. About 
> your notes, I will try to answer in a semi-general fashion due to the 
> general/public nature of the mailing list. You're welcome to contact me 
> privately for more detailed discussion.
>
> > Some thoughts on your comments:
> > 1. Yes, my plan is to focus only on the hybrid morph and the parental 
> > morphs. A bit more on the shape side: yes, I am interested in 
> > individual head shape, which is quite distinct among morphs. However, I 
> > found a significant size-shape interaction, and the size distributions 
> > of two morphs (one parental and one hybrid) do not overlap with that of 
> > the other parental morph. Will this be a problem? Or should I simply 
> > include size as a covariate?
>
> Hard to tell without actually working with the data. I also seem to 
> recall that the kind of fish you study differentiates into morphs with 
> distinct sizes. Let us say that, if I were you, I would ask myself: 1. 
> "how bad" the interaction is, which morphs it involves and what is 
> causing it, 2. whether the allometric variation you are observing within 
> morphs is mostly static or ontogenetic (and whether this is the case 
> within all morphs), and 3. to what extent removing between-morph 
> variation due to size is going to remove interesting biological 
> variation. I suspect that addressing these questions may help you 
> figure out a sensible course of action with your data.
>
> > For the shape input data, should I try the coordinates first, and 
> > then the distances? I guess distances are more about differentiation, 
> > so they may not apply in my case.
>
> Again, it may depend. But, as a general answer, you may want to consider 
> that by transforming everything into distances you lose information about 
> (essentially) "how" shape varies. So, while transforming into distances 
> may be a good idea in certain cases, in most cases you may want to ask 
> yourself whether you really want to lose the information in the form of 
> Procrustes coordinates.
>
>
> > 2. How should I select the SNPs if I take SNPs as input data? As you 
> > said, it is a great idea to choose the markers fixed between the 
> > parental morphs and encode them as numbers. I will try this to see how 
> > many markers I get. As I work with ddRAD and missingness up to 30% is 
> > common, I was thinking of filtering the SNPs on: 1) low missingness 
> > (present in 90% of individuals, etc.) and 2) high Fst value (I guess 
> > the extreme case will be the ones fixed in the parental morphs?). In a 
> > genetic PCA, we normally replace missing data with the mean, so 
> > missingness may not be a huge problem, but whether the 
> > filtering/sub-sampling of SNPs makes sense is important.
>
> Again, this is another one where it's hard to give a general answer and 
> one decides on a case-by-case basis. Based on what you say, I would be 
> inclined to: 1. on the full dataset (i.e., not just the SNPs fixed 
> between parental morphs), select a "robust" subset of SNPs (not just 
> low missingness but also stringent thresholding and other filters which 
> reduce the number of your SNPs but increase quality), to have 
> good-quality data with little missing data; 2. run some form of 
> genotype imputation on this global, genome-wide dataset (note that I'm 
> not referring to mere mean replacement of missing data, but to proper 
> genotype imputation); 3. select those SNPs which are fixed between 
> parental morphs and biallelic in the hybrid morph (as per my previous 
> message).
> To reiterate, this approach may or may not be the right one for you and 
> it's just an idea based on what you wrote.
>
> > 3. Take the ancestry proportion calculated by Admixture as the input. I 
> > also had this idea, since it is the most straightforward way to target 
> > introgression. I guess any of 1) 2b-PLS, 2) general linear models, or 
> > 3) a Mantel test may work.
>
> Not sure whether/why you would need a Mantel test (see also above about 
> distances).
>
> > To summarize, as you said, the choice of input data depends on the 
> > exact question I want to ask. Ancestry proportion reduces the 
> > high-dimensional genetic variation to a single variable. However, this 
> > may be more straightforward than using the SNPs of interest, which 
> > retain more variation but make the findings harder to interpret, 
> > I guess.
>
> One of the many ways in which the two approaches differ depends on 
> whether the same or different regions of the genome are involved in 
> introgression across individuals. For instance, whether two individuals 
> with the same "level of admixture" ("ancestry proportion") will have 
> admixture in the same regions, and whether the genomic regions involved 
> in less admixed individuals are a subset of the regions involved in more 
> admixed individuals.
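
A toy illustration of the point above, as a sketch only. It assumes hypothetical genotypes already recoded as counts (0, 1, 2) of the allele from one parental morph at eight diagnostic loci; the function names and the data are invented for the example and do not come from any real dataset.

```python
def hybrid_index(coded):
    """Proportion of parent-B alleles across scored diploid loci."""
    scored = [g for g in coded if g is not None]  # skip missing genotypes
    return sum(scored) / (2 * len(scored))


def introgressed_loci(coded):
    """Indices of loci carrying at least one parent-B allele."""
    return {j for j, g in enumerate(coded) if g}


# Hypothetical individuals: counts of the parent-B allele at 8 loci.
ind_1 = [2, 2, 0, 0, 0, 0, 0, 0]
ind_2 = [0, 0, 0, 0, 0, 0, 2, 2]
ind_3 = [2, 2, 2, 0, 0, 0, 0, 0]

for name, ind in (("ind_1", ind_1), ("ind_2", ind_2), ("ind_3", ind_3)):
    print(name, hybrid_index(ind), sorted(introgressed_loci(ind)))
```

Here ind_1 and ind_2 have the same overall proportion but share no introgressed locus, while the loci of ind_1 are a subset of those of the more admixed ind_3. A single ancestry proportion cannot tell these situations apart; locus-by-locus scores can.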
>
> I hope the above helps.
> Best,
> Carmelo
>
> > On Wednesday, 2 November 2022 at 05:54:40 UTC, <Carmelo Fruciano> wrote:
> > 
> > 
> > 
> > On 01/11/22 18:02, Han Xiao wrote:
> > > Dear morpho people,
> > >
> > > I am writing to ask a rarely discussed question, which is to test
> > > associations between genomic data and shape variation.
> > 
> > Dear Han,
> > yes, this is a topic that is perhaps less frequently discussed than
> > others. As a participant in one of the editions of my geometric
> > morphometrics course, you may recall we covered this general topic to
> > some extent.
> > 
> > > To describe my system a bit first, I am working with four sympatric
> > > fish morphs in a lake. I have both genetic data (SNPs generated by
> > > ddRADSeq, around 12000 SNPs) and shape data (landmark- and
> > > semilandmark-based GM for the head shape) of the same fish samples.
> > > The genetics indicate that three morphs are genetically distinct,
> > > while one is of hybrid origin between two morphs, with different
> > > degrees of introgression. I would like to ask: for the three related
> > > morphs (two parental morphs and one hybrid morph), is there any
> > > correlation between the degree of genetic introgression and the
> > > shape variation?
> > 
> > Presumably, if this is your main question and by "shape variation" you
> > mean "individual shape", you would be performing the analysis mainly on
> > the only introgressed morph (whose individuals I have to assume based
> > on your text have varying levels of introgression), using information
> > from the two "parental" morphs.
> > 
> > > It was suggested that I apply a 2b-PLS to test it. I then searched
> > > the literature and found a few cases. However, the studies vary in
> > > the input data for both genetics and morphometrics. For genetics,
> > > people have used genetic distances (calculated as Fst/(1-Fst), where
> > > Fst is a measure of genetic differentiation), Prevosti distance,
> > > allele frequencies (a few microsatellites), and expression results
> > > (numeric and continuous). For the shape data, people have used
> > > Euclidean distances, GPA coordinates, centroid sizes, etc.
> > 
> > About the genetic measures, if the question is about the degree of
> > introgression (of one morph into another when producing the third
> > morph), it is doubtful that any of the measures you mention would
> > adequately capture that. For instance, within your introgressed morph
> > genetic variation among individuals may not be produced exclusively by
> > varying levels of introgression. So FST would be a poor choice because
> > it would capture genetic variation produced by other causes (e.g.,
> > neutral variation). There are other reasons why FST may be a poor
> > choice, but let's keep it at that.
> > 
> > Perhaps a semi-decent solution to quantify the degree of introgression
> > would be to subset your SNP panel to only those SNPs (if any) which are
> > fixed between "parental" morphs (and which are biallelic in the
> > introgressed morph), code them to reflect their "polarity" (e.g., 0 the
> > allele in one parental morph, 1 the allele in the other parental morph)
> > rather than using the actual nucleotides, and then use the data scored
> > this way for your individuals from the introgressed morph to do tests
> > with morphology.
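
A minimal sketch of the recoding described above, under stated assumptions: genotypes are stored as counts (0, 1, 2) of the alternate allele with None for missing data, so every SNP is treated as biallelic by construction, and a diploid individual is scored as the count of parent-B alleles (0, 1, 2) rather than a single 0/1 value. The function name `polarize_fixed_snps` and the example data are hypothetical.

```python
def polarize_fixed_snps(parent_a, parent_b, hybrids):
    """Keep SNPs fixed for opposite alleles in the two parental morphs and
    recode hybrid genotypes as counts (0, 1, 2) of the parent-B allele.

    Each argument is a list of individuals; each individual is a list of
    alternate-allele counts (0, 1, 2), with None for a missing genotype.
    """
    n_snps = len(parent_a[0])
    kept, flips = [], []
    for j in range(n_snps):
        a = {ind[j] for ind in parent_a if ind[j] is not None}
        b = {ind[j] for ind in parent_b if ind[j] is not None}
        if a == {0} and b == {2}:
            kept.append(j)
            flips.append(False)  # alternate allele is the parent-B allele
        elif a == {2} and b == {0}:
            kept.append(j)
            flips.append(True)   # reference allele is the parent-B allele
    coded = []
    for ind in hybrids:
        row = []
        for j, flip in zip(kept, flips):
            g = ind[j]
            row.append(None if g is None else (2 - g if flip else g))
        coded.append(row)
    return kept, coded


# Hypothetical data: 2 individuals per group, 4 SNPs.
parent_a = [[0, 2, 0, 1], [0, 2, 0, 0]]
parent_b = [[2, 0, 0, 2], [2, 0, None, 2]]
hybrids = [[1, 2, 0, 2], [2, None, 0, 1]]

kept, coded = polarize_fixed_snps(parent_a, parent_b, hybrids)
print(kept)
print(coded)
```

Only the first two SNPs are fixed for opposite alleles here; the third is monomorphic across both parental morphs and the fourth is polymorphic within one of them. With small parental samples, "fixed" in the sample does not guarantee fixed in the population, so a stricter criterion may be sensible.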
> > 
> > The above is just a very rough solution, with ample margins of
> > improvement depending on the details of your system (e.g., ongoing gene
> > flow between the two "parental" morphs, with most alleles not being
> > fixed between them). But, as you may imagine, this goes well beyond
> > this brief reply and would require more in-depth knowledge of your
> > specific situation (notice how I had to make several assumptions about
> > how genetic variation is distributed among your morphs).
> > Notice also that if the level of introgression is all you care about
> > (regardless of which loci it comes from) you may obtain a much better
> > and "faster" (i.e., less work for you) solution by using individual
> > estimates of levels of admixture between morphs from one of the
> > software packages used for analysis of genetic admixture (which you
> > have probably used anyway).
> > 
> > > So my questions are:
> > > 1. Do you all agree that 2b-PLS should also make sense for such a
> > > comparison?
> > 
> > PLS may be a good solution to identify how the shape and levels of
> > introgression co-vary. Tests based on a measure of association (e.g.,
> > Escoufier RV) may be used to test the null hypothesis that they are
> > independent.
> > If your estimate of level of introgression is univariate (notice that
> > in the rough solution I suggested above this may not be the case), you may
> > also consider general linear models (and associated tests of
> > significance) using the level of introgression as a predictor.
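
A sketch of a permutation test of independence based on the Escoufier RV coefficient, in plain Python so it carries no dependencies. It is an illustration under assumptions, not a replacement for established morphometrics software: `x` and `y` are hypothetical lists of per-individual rows (for example, introgression scores and shape variables), rows are matched by individual, and the function name `rv_permutation_test` is invented here.

```python
import random


def _gram(rows):
    """n x n Gram matrix of the column-centred data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    c = [[r[k] - means[k] for k in range(p)] for r in rows]
    return [[sum(ci[k] * cj[k] for k in range(p)) for cj in c] for ci in c]


def _inner(a, b, idx):
    """trace(A B) with the rows and columns of B reordered by idx."""
    n = len(a)
    return sum(a[i][j] * b[idx[i]][idx[j]] for i in range(n) for j in range(n))


def rv_permutation_test(x, y, n_perm=999, seed=0):
    """Return (RV, p-value); the null is independence of the two blocks."""
    a, b = _gram(x), _gram(y)
    ident = list(range(len(a)))
    # The denominator does not change when individuals are permuted.
    denom = (_inner(a, a, ident) * _inner(b, b, ident)) ** 0.5
    observed = _inner(a, b, ident) / denom
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        idx = ident[:]
        rng.shuffle(idx)  # reassign the rows of y to individuals at random
        if _inner(a, b, idx) / denom >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical example: 6 individuals, 1 genetic and 2 shape variables.
x = [[0.10], [0.25], [0.30], [0.50], [0.65], [0.80]]
y = [[1.0, 0.2], [1.3, 0.1], [1.2, 0.4], [1.9, 0.3], [2.2, 0.6], [2.4, 0.5]]
rv, p = rv_permutation_test(x, y, n_perm=199)
print(rv, p)
```

Working on Gram matrices keeps the code short because permuting individuals in one block only reorders the rows and columns of its Gram matrix. The test assumes exchangeable individuals, so structure such as morph membership would need a restricted permutation scheme.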
> > 
> > > 2. What would you suggest for the input files? (I do have some
> > > considerations to discuss)
> > 
> > See above. I suppose the main issue is a bit beyond input files per se
> > and more about how you quantify/represent introgression.
> > 
> > > 3. Is there any other analysis you would recommend?
> > > I know people normally use GWAS to search for associations;
> > > however, I am looking for something that can tolerate a smaller
> > > sample size (30 fish per morph).
> > 
> > This is absolutely correct. But, more fundamentally, GWAS' goal is
> > quite distinct from the hypothesis you want to test.
> > 
> > > Also, the potentially transgressive shape of
> > > hybrids may be a confounding factor, especially since different
> > > allometries are observed.
> > 
> > Yes. But transgressive segregation may not be a concern if you are just
> > interested in whether and how levels of introgression scored within a
> > single "introgressed" morph are associated with shape variation.
> > 
> > Best,
> > Carmelo
> > 
> > -- 
> > ==================
> > Carmelo Fruciano
> > Italian National Research Council (CNR)
> > IRBIM Messina
> > http://www.fruciano.org/
> > ==================
> > 
>
> -- 
> ==================
> Carmelo Fruciano
> Italian National Research Council (CNR)
> IRBIM Messina
> http://www.fruciano.org/
> ==================
>

-- 
You received this message because you are subscribed to the Google Groups 
"Morphmet" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/morphmet2/7a08245d-9215-4dc3-9541-b2762a583291n%40googlegroups.com.
