Hi Laurent,
There's a typo in my email. I meant to write:
1. edgeR will stop reporting tags with extreme variances as
*differentially expressed* if the user reduces the prior weight, prior.n
...
Gordon
On Wed, 2 Mar 2011, Gordon K Smyth wrote:
Hi Laurent,
Thanks for the nice summary. Two more points:
1. edgeR will stop reporting tags with extreme variances as outliers if the
user reduces the prior weight, prior.n, given to the common dispersion
(expressed in terms of the number of notional prior tags). Seeing such tags
in the topTags table may prompt the user to do this.
2. It would be very helpful to know whether these high variance tags arise
from (i) technical errors specific to one count, (ii) technical issues
affecting a tag or (iii) genuine biological variation. If (i), then we could
design software to detect outlier counts. If (ii), we could design software
to detect outlier tags. If (iii), then an empirical Bayes approach to
moderating the dispersions, such as is done by edgeR, may be the best that
can be done.
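
As a rough illustration of point 1, here is a toy calculation in Python. It is not edgeR's actual computation: it assumes a plain weighted average between a tag's own dispersion and the common dispersion, and the function name, the df weight and all the numbers are made up for the example. It only shows the direction of the effect of reducing prior.n.

```python
def moderated_dispersion(tag_disp, common_disp, prior_n, df=1.0):
    """Toy weighted average of a tag's own dispersion and the common one.

    prior_n stands in for the number of notional prior tags; df is a
    hypothetical weight given to the tag's own data. This is an
    assumption for illustration, not edgeR's real moderation.
    """
    return (df * tag_disp + prior_n * common_disp) / (df + prior_n)


common = 0.1  # made-up common dispersion
noisy = 4.0   # made-up raw dispersion for a tag with extreme variance

# Smaller prior_n leaves the tag closer to its own (large) dispersion.
for prior_n in (50, 10, 1):
    print(prior_n, moderated_dispersion(noisy, common, prior_n))
```

Under this simplification, a smaller prior.n keeps the moderated dispersion of a high-variance tag large, so a test based on it is less likely to call the tag differentially expressed.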
I don't know for sure how to distinguish these causes, but here are some
thoughts. In your original post, you showed a tag with a large count for
library A3 but zeros for all other libraries. Is library A3 systematically
different from libraries A1 and A2 for other tags as well as this one? If
this tag is part of co-regulated pathways that are highly expressed in A3
relative to the others, then likely it is real biological variation. If A3
differs from A1 and A2 only in a handful of tags with no biological
connection, then perhaps it is a technical issue.
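
One possible first check of whether A3 differs systematically from A1 and A2 is sketched below in Python. The counts are invented, the helper names are hypothetical, and the log2(count + 1) transform is just one convenient choice. It compares libraries pairwise and lists the tags seen only in A3. It cannot decide whether those tags are biologically connected, which still needs annotation.

```python
import math

# Invented counts per tag for three replicate libraries (hypothetical data).
counts = {
    "tag1": {"A1": 10, "A2": 12, "A3": 11},
    "tag2": {"A1": 0, "A2": 0, "A3": 250},
    "tag3": {"A1": 100, "A2": 90, "A3": 105},
    "tag4": {"A1": 5, "A2": 7, "A3": 6},
    "tag5": {"A1": 40, "A2": 35, "A3": 300},
}


def log_counts(lib):
    """log2(count + 1) for every tag in one library, in a fixed tag order."""
    return [math.log2(c[lib] + 1) for c in counts.values()]


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def only_in(lib):
    """Tags with a non-zero count in lib and zeros in all other libraries."""
    return [
        tag for tag, c in counts.items()
        if c[lib] > 0 and all(v == 0 for k, v in c.items() if k != lib)
    ]


for a, b in (("A1", "A2"), ("A1", "A3"), ("A2", "A3")):
    print(a, b, pearson(log_counts(a), log_counts(b)))
print("only in A3:", only_in("A3"))
```

If A3 correlates poorly with both A1 and A2 across many tags, that points to a library-wide difference. If the disagreement is confined to a handful of tags, those are the ones to inspect.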
Regards
Gordon
Date: Tue, 01 Mar 2011 10:25:31 +0100
From: Laurent Gautier <[email protected]>
To: [email protected]
Cc: [email protected]
Subject: Re: [Bioc-sig-seq] RNASeq, differential expression between
group, and large variance within groups
Thanks to Mads, Simon, and Steve.
In summary:
- extreme variance within a group (counts that are either zero or very
large) is not a good sign, and experimental issues are to be suspected
- pooling (summing) tags over reference transcripts can rescue some of
the signal
- DESeq, and to some extent edgeR, will report genes/tags with such
pathological counts as differentially expressed when they should not.
The issue is acknowledged and care should be taken (here we use
various visualizations to complement the p-values).
Laurent