Re: [R-sig-phylo] Data normalization

2021-09-19 Thread Jacob Berv
Interesting thought. I am not familiar with these analyses, but it would not 
surprise me if your intuition is correct.
J


> On Sep 19, 2021, at 1:00 PM, Ferenc Tibor Kagan wrote:
> 
> Dear r-sig-phylo community,
> 
> I am writing in the hope of getting your input on a specific topic.
> 
> I have noticed that phylogenetic comparative methods (PCMs) are increasingly
> being applied to gene expression data. Before fitting a model to the
> expression data, many of these studies carry out several normalization
> steps. The usual order is to normalize for sequencing depth and gene length,
> then between replicates within species, and finally across species. For the
> within-species and across-species steps I have seen TMM normalization (from
> the edgeR package) or batch effect removal (e.g. the ComBat_seq function
> from the sva package) being used.
> 
> My concern is the final step, namely normalizing continuous data across
> species before model fitting. Wouldn't doing so reduce the phylogenetic
> signal present in the dataset and therefore affect which model fits best?
> 
> Thank you in advance for your answer.
> 
> Best regards,
> Ferenc Kagan
> 
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] Data normalization

2021-09-19 Thread Ferenc Tibor Kagan
Dear r-sig-phylo community,

I am writing in the hope of getting your input on a specific topic.

I have noticed that phylogenetic comparative methods (PCMs) are increasingly
being applied to gene expression data. Before fitting a model to the
expression data, many of these studies carry out several normalization steps.
The usual order is to normalize for sequencing depth and gene length, then
between replicates within species, and finally across species. For the
within-species and across-species steps I have seen TMM normalization (from
the edgeR package) or batch effect removal (e.g. the ComBat_seq function from
the sva package) being used.
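
As a rough illustration of the first of these steps (sequencing depth and
gene length), here is a minimal Python sketch of a TPM-style calculation. The
function name tpm and the toy counts are made up for illustration; the
studies in question may well use a different unit (e.g. RPKM/FPKM) or tool.

```python
import numpy as np

def tpm(counts, gene_lengths_bp):
    """Transcripts-per-million from raw counts (hypothetical helper).

    counts: genes x samples array of raw read counts.
    gene_lengths_bp: one length per gene, in base pairs.
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float) / 1000.0
    # Reads per kilobase: corrects for gene length.
    rpk = counts / lengths_kb[:, None]
    # Rescale each sample (column) to sum to one million: corrects for depth.
    return rpk / rpk.sum(axis=0, keepdims=True) * 1e6

# Toy data: sample 2 is sample 1 sequenced at twice the depth.
counts = np.array([[100, 200],
                   [300, 600],
                   [50, 100]])
lengths = np.array([1000, 3000, 500])
expr = tpm(counts, lengths)
```

Because the second toy sample differs from the first only in depth, the two
columns of expr come out identical after this step.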

My concern is the final step, namely normalizing continuous data across
species before model fitting. Wouldn't doing so reduce the phylogenetic signal
present in the dataset and therefore affect which model fits best?
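
To make the concern concrete, here is a toy Python simulation. It is not
taken from any of the studies mentioned. It assumes a hypothetical
four-species tree ((A,B),(C,D)), Brownian motion evolution, and an
across-species step that treats species as a batch and removes its per-gene
mean. That is the extreme case: TMM, for instance, rescales whole libraries
rather than individual genes, so real pipelines may lose much less.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical balanced tree ((A,B),(C,D)): every branch has length 1,
# Brownian motion with rate 1, root state 0 for every gene.
n_genes, n_reps = 200, 3
left = rng.normal(0.0, 1.0, n_genes)    # change on the branch leading to (A,B)
right = rng.normal(0.0, 1.0, n_genes)   # change on the branch leading to (C,D)
species_means = np.stack([
    left + rng.normal(0.0, 1.0, n_genes),   # A
    left + rng.normal(0.0, 1.0, n_genes),   # B
    right + rng.normal(0.0, 1.0, n_genes),  # C
    right + rng.normal(0.0, 1.0, n_genes),  # D
], axis=1)                                   # genes x species

# Replicates: species mean plus small measurement noise.
# Shape: genes x species x replicates.
expr = species_means[:, :, None] + rng.normal(0.0, 0.1, (n_genes, 4, n_reps))

def among_species_variance(x):
    """Variance of the species means per gene, averaged over genes."""
    return x.mean(axis=2).var(axis=1, ddof=1).mean()

# Assumed across-species step: remove each species' per-gene mean and
# restore the gene's grand mean (species treated as a batch).
grand = expr.mean(axis=(1, 2), keepdims=True)
centered = expr - expr.mean(axis=2, keepdims=True) + grand

before = among_species_variance(expr)
after = among_species_variance(centered)
print(f"among-species variance before: {before:.3f}, after: {after:.3e}")
```

Under these assumptions the among-species variance, which is where the
phylogenetic signal lives, drops to numerical zero, so any model fitted
afterwards sees only replicate-level noise.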

Thank you in advance for your answer.

Best regards,
Ferenc Kagan
