Thanks Jeremy, I will do that before trying the *de novo* assembly.

Luciano
On Fri, May 18, 2012 at 1:44 PM, Jeremy Goecks <[email protected]> wrote:

> > I find a lot of potential new genes (hundreds or thousands of reads
> > aligning to regions where there is no gene annotation),
>
> This shouldn't be completely unexpected. High-coverage RNA-seq data is
> constantly revealing new exons/splicing/transcripts, even in well-annotated
> genomes.
>
> > I also find new exons for some genes or exons with different sizes. I was
> > thinking of doing a *de novo* assembly to find new transcripts and genes,
> > but I was wondering if there is something else I could do.
>
> My suggestion: do reference-guided assembly with Cufflinks; this will
> yield both existing and new transcripts.
>
> > For example, maybe I could just extract those regions where thousands of
> > reads align (new gene). I know that we can extract the sequence data for
> > a specific transcript; is it possible to extract reads for regions without
> > annotation, based only on the number of reads aligned?
>
> You could subtract known genes from the Cufflinks assembly to get only
> novel transcripts.
>
> Best,
> J.
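
The "subtract known genes" step suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: transcripts and genes are plain (chrom, start, end, name) tuples with half-open coordinates, and the function name `novel_transcripts` and the sample records are hypothetical, not part of Cufflinks or Galaxy. In practice the same idea would be applied to the assembled transcripts file and a gene annotation file with an interval-subtraction tool.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def novel_transcripts(assembled, known_genes):
    """Keep assembled transcripts that overlap no known gene on the same chromosome.

    Both inputs are lists of (chrom, start, end, name) tuples (an assumed,
    simplified stand-in for real annotation records).
    """
    # Group known gene intervals by chromosome for quicker lookup.
    genes_by_chrom = defaultdict(list)
    for chrom, start, end, _name in known_genes:
        genes_by_chrom[chrom].append((start, end))

    novel = []
    for chrom, start, end, name in assembled:
        chrom_genes = genes_by_chrom.get(chrom, [])
        if not any(overlaps(start, end, g_start, g_end) for g_start, g_end in chrom_genes):
            novel.append((chrom, start, end, name))
    return novel


# Hypothetical example records, for illustration only.
known = [("chr1", 100, 500, "geneA"), ("chr2", 1000, 2000, "geneB")]
assembled = [
    ("chr1", 150, 450, "CUFF.1.1"),    # inside geneA
    ("chr1", 800, 1200, "CUFF.2.1"),   # no annotated gene here
    ("chr2", 1900, 2500, "CUFF.3.1"),  # partially overlaps geneB
    ("chr3", 10, 90, "CUFF.4.1"),      # chromosome with no annotation
]

for record in novel_transcripts(assembled, known):
    print(record)
```

Note that this whole-interval subtraction also discards new isoforms or extended exons of annotated genes, since those overlap a known gene; it only keeps candidates in unannotated regions.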

