On Wed, Apr 18, 2012 at 8:37 AM, Jeremy Goecks <jeremy.goe...@emory.edu> wrote:
> I am wondering if these "non-coding reads" will be included when cufflinks
> calculates transcript/gene expression.
> Reads will only be included if they map to assembled/known transcripts.
Well, it depends on which transcript annotation file you pass to cuffdiff.
If you run cufflinks with --GTF, the manual says:
"Tells Cufflinks to use the supplied reference annotation (a GFF file)
to estimate isoform expression. It will not assemble novel
transcripts, and the program will ignore alignments not structurally
compatible with any reference transcript."
In Galaxy terms, this is the option "Use Reference Annotation:" with
"Use reference annotation" selected. The two other choices, "No" or
"Use reference annotation as guide", allow cufflinks to assemble and
quantify transcripts that are not in the annotation. If you later use
cuffmerge to produce the transcript annotation from your cufflinks runs
and pass it to cuffdiff, the "non-coding reads" will almost certainly
pollute your transcript expression estimates.
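
To make the concern concrete, here is a minimal Python sketch of the
number in question: the percentage of aligned reads that overlap no
annotated transcript. It is only an illustration under stated
assumptions: reads and transcripts are already reduced to
(chrom, start, end) tuples with half-open coordinates, the function
names are made up for this example, and plain overlap is a much looser
test than the structural compatibility cufflinks applies.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def build_index(transcripts):
    """Map each chromosome to parallel lists of merged starts and ends."""
    by_chrom = defaultdict(list)
    for chrom, start, end in transcripts:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps_known(index, chrom, start, end):
    """True if the read [start, end) overlaps any annotated interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged interval that begins before the read ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def percent_outside(transcripts, reads):
    """Percentage of reads that overlap no annotated transcript."""
    index = build_index(transcripts)
    total = outside = 0
    for chrom, start, end in reads:
        total += 1
        if not overlaps_known(index, chrom, start, end):
            outside += 1
    return 100.0 * outside / total if total else 0.0
```

In practice the tuples would come from the aligned reads and from the
reference annotation; that parsing step is left out here.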
Jeremy, do you have a workflow to estimate what percentage of the reads
map to unknown expressed regions? I would like to be able to produce
this estimate before I decide which transcript annotation to pass to
cuffdiff. I would expect a small percentage of reads to map outside of
known expressed regions, but if this number is too big, then I would
like to check for potential problems with my