Thanks Jeremy,
   I will do that before trying the *de novo* assembly.


On Fri, May 18, 2012 at 1:44 PM, Jeremy Goecks <> wrote:

> > I find a lot of potential new genes (hundreds or thousands of reads
> > aligning to regions where there is no gene annotation),
>
> This shouldn't be completely unexpected. High-coverage RNA-seq data is
> constantly revealing new exons/splicing/transcripts, even in well-annotated
> genomes.
>
> > I also find new exons for some genes or exons with different sizes. I was
> > thinking of doing a *de novo* assembly to find new transcripts and genes,
> > but I was wondering if there is something else I could do.
>
> My suggestion: do reference-guided assembly with Cufflinks; this will
> yield both existing and new transcripts.
>
> > For example, maybe I could just extract those regions where thousands of
> > reads align (new gene). I know that we can extract the sequence data for
> > a specific transcript; is it possible to extract reads for regions without
> > annotation, based only on the number of reads aligned?
>
> You could subtract known genes from the Cufflinks assembly to get only
> novel transcripts.
>
> Best,
> J.
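
For reference, a minimal sketch of the "subtract known genes" step suggested above. This is only an illustration of the interval logic, not how Cufflinks or Galaxy implement it: the function name `novel_transcripts`, the tuple layouts, and all coordinates are hypothetical, and strand is ignored. In practice the same subtraction would normally be done with an interval-operations tool on the assembled transcripts and the reference annotation.

```python
from collections import defaultdict


def novel_transcripts(assembled, known_genes):
    """Return names of assembled transcripts that overlap no known gene.

    assembled:   list of (name, chrom, start, end) tuples (hypothetical layout)
    known_genes: list of (chrom, start, end) tuples (hypothetical layout)
    Coordinates are treated as half-open intervals [start, end).
    """
    # Group known gene intervals by chromosome for quick lookup.
    genes_by_chrom = defaultdict(list)
    for chrom, start, end in known_genes:
        genes_by_chrom[chrom].append((start, end))

    novel = []
    for name, chrom, start, end in assembled:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            start < g_end and g_start < end
            for g_start, g_end in genes_by_chrom[chrom]
        )
        if not overlaps:
            novel.append(name)
    return novel


# Made-up example data, for illustration only.
known_genes = [("chr1", 1000, 5000), ("chr2", 200, 900)]
assembled = [
    ("TCONS_1", "chr1", 1200, 4800),   # inside a known gene
    ("TCONS_2", "chr1", 9000, 12000),  # intergenic
    ("TCONS_3", "chr2", 850, 1500),    # overlaps the end of a known gene
    ("TCONS_4", "chr3", 100, 700),     # chromosome with no annotated genes
]

print(novel_transcripts(assembled, known_genes))  # ['TCONS_2', 'TCONS_4']
```

Note that this drops any transcript touching a known gene, so new exons or isoforms of annotated genes would be removed as well; only fully intergenic candidates survive.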