
> This is one thing I would like help with- is it worth simply reducing to 
> nothing the max intron size? What is accepted consensus when using tophat on 
> bacterial genomes?

I'm not sure that folks on this list have much experience with bacterial 
transcriptome analysis. You might try asking elsewhere or emailing the 
Tophat/Cufflinks authors directly. If you find something interesting in 
another place, please feel free to share it with the Galaxy community.

> When I look at the second tophat file, of accepted hits, all hits align 
> nicely with known genes.  However, when I run cufflinks I run into the 
> following issues: when I use a reference genome, I get in addition to the 
> known transcripts, a bunch of very long transcripts spanning very large 
> genomic regions. Also, I will have two genes that are very near each other 
> but run in opposite directions (which you can see beautifully in the tophat 
> accepted hits alignments - different colors for each strand) but they merge 
> into a single CUFF identifier.  Is there any way I can address this- is it 
> something I am missing with respect to parameters I have to change because I 
> am working on a bacterial genome?

Do you mean a reference genome or a reference gene annotation? Using a genome 
to correct for bias should not change the assembled transcripts, only their 
estimated expression levels. 
You can use a reference gene annotation either as ground truth or as a guide; 
using the reference as ground truth ensures that Cufflinks will only assemble 
transcripts defined in the annotation.
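
To make the distinction concrete, here is a rough sketch of how the three 
choices map onto Cufflinks command-line options. It only builds the argument 
list and does not run anything. The helper name and the file names 
(accepted_hits.bam, genes.gtf, cufflinks_out) are placeholders I made up for 
illustration; the options themselves (--frag-bias-correct, --GTF, --GTF-guide) 
are from the Cufflinks manual, but please check them against the version you 
have installed.

```python
def cufflinks_command(bam, out_dir, genome_fasta=None,
                      annotation_gtf=None, annotation_as="truth"):
    """Build a cufflinks argument list (hypothetical helper; runs nothing)."""
    cmd = ["cufflinks", "-o", out_dir]
    if genome_fasta is not None:
        # Reference genome: used for bias correction, which should affect
        # expression estimates rather than which transcripts are assembled.
        cmd += ["--frag-bias-correct", genome_fasta]
    if annotation_gtf is not None:
        if annotation_as == "truth":
            # Annotation as ground truth: only annotated transcripts are
            # quantified, no novel assembly.
            cmd += ["--GTF", annotation_gtf]
        elif annotation_as == "guide":
            # Annotation as a guide: assembly may still report novel
            # transcripts alongside the annotated ones.
            cmd += ["--GTF-guide", annotation_gtf]
        else:
            raise ValueError("annotation_as must be 'truth' or 'guide'")
    cmd.append(bam)
    return cmd


# Placeholder file names; substitute your own.
print(" ".join(cufflinks_command("accepted_hits.bam", "cufflinks_out",
                                 annotation_gtf="genes.gtf")))
```

If the long merged transcripts go away when the annotation is used as ground 
truth, that would suggest they come from the assembly step rather than from 
the alignments.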

