I am not sure about cuffcompare, but cuffdiff doesn't generate any extra files 
if you add more groups and replicates to the command line. It adds columns to 
the output files but the number of files remains the same.

As a workflow for Martin, for now I would suggest the following for making 
calls with no novel genes:

1) upload your reads
2) run FASTQ Groomer to convert them into sanger format
3) run tophat on each lane individually
4) run cuffcompare with the gtf file you downloaded from UCSC or wherever 
against itself; this puts it in a nice format to use with cuffdiff
5) merge the bam files from tophat for the 10 lanes from each group into one 
bam file per group
6) run cuffdiff using the transcript gtf output file from cuffcompare and the 
two merged bam files
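
For anyone running the same steps outside Galaxy, steps 3 to 6 can be sketched 
roughly as below. This is only a sketch under assumptions: the function name, 
the file names, and the directory layout are made up for illustration, the 
reads are assumed to be already groomed, and it assumes a tophat version that 
writes accepted_hits.bam into its output directory and a cuffcompare that 
writes <prefix>.combined.gtf. It only builds the command lines; it does not 
run anything.

```python
import posixpath


def build_workflow_commands(groups, bowtie_index, reference_gtf, workdir="out"):
    """Build (but do not run) the command lines for steps 3-6.

    groups maps a group name to the list of FASTQ files (one per lane).
    All names and paths here are hypothetical placeholders.
    """
    commands = []
    lane_bams = {}

    # Step 3: one tophat run per lane.
    for group, lanes in groups.items():
        lane_bams[group] = []
        for i, fastq in enumerate(lanes, start=1):
            lane_dir = posixpath.join(workdir, f"{group}_lane{i}")
            commands.append(["tophat", "-o", lane_dir, bowtie_index, fastq])
            # Assumes tophat writes accepted_hits.bam into its output dir.
            lane_bams[group].append(posixpath.join(lane_dir, "accepted_hits.bam"))

    # Step 4: cuffcompare the reference gtf against itself.
    prefix = "cuffcmp"
    commands.append(["cuffcompare", "-r", reference_gtf, "-o", prefix, reference_gtf])
    transcripts_gtf = prefix + ".combined.gtf"

    # Step 5: merge the per-lane bam files into one bam file per group.
    merged_bams = []
    for group, bams in lane_bams.items():
        merged = posixpath.join(workdir, f"{group}_merged.bam")
        commands.append(["samtools", "merge", merged] + bams)
        merged_bams.append(merged)

    # Step 6: cuffdiff on the cuffcompare gtf and the merged bam files.
    cuffdiff_dir = posixpath.join(workdir, "cuffdiff")
    commands.append(["cuffdiff", "-o", cuffdiff_dir, transcripts_gtf] + merged_bams)
    return commands


if __name__ == "__main__":
    example = {"control": ["c1.fastq", "c2.fastq"], "treated": ["t1.fastq"]}
    for cmd in build_workflow_commands(example, "bowtie_index", "genes.gtf"):
        print(" ".join(cmd))
```

Each returned entry is an argument list, so it could be passed to 
subprocess.run once the paths have been checked against your own setup.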

Merging is kind of crappy because you lose in-replicate variation information, 
but it's the best you can do for now. I have patched Galaxy to have cuffdiff 
handle replicates and do normalization. When that gets merged into the main 
branch, your workflow will be the same, except you won't have to merge all of 
the bam files from each condition together to use cuffdiff.
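
For reference, on the command line cuffdiff takes the replicates of one 
condition as a comma-separated list of bam files, with the conditions 
separated by spaces. A minimal sketch of building that call is below; the 
function name and file names are hypothetical, and this shows the plain 
command line, not how the Galaxy patch itself is implemented.

```python
def cuffdiff_replicate_command(transcripts_gtf, conditions, output_dir="cuffdiff_out"):
    """Build (but do not run) a cuffdiff call with replicates.

    conditions is a list with one entry per condition; each entry is the
    list of replicate bam files for that condition (hypothetical paths).
    """
    # Replicates within a condition are joined with commas.
    samples = [",".join(bams) for bams in conditions]
    return ["cuffdiff", "-o", output_dir, transcripts_gtf] + samples


if __name__ == "__main__":
    cmd = cuffdiff_replicate_command(
        "cuffcmp.combined.gtf",
        [["c_lane1.bam", "c_lane2.bam"], ["t_lane1.bam", "t_lane2.bam"]],
    )
    print(" ".join(cmd))
```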


On Jan 21, 2011, at 9:40 AM, Jeremy Goecks wrote:

> Hi David,
> Cuffcompare and Cuffdiff generate many more outputs than most other tools; 
> specifically, both generate multiple output files for each additional input 
> given. While Galaxy can handle an arbitrary number of inputs easily, handling 
> so many outputs is challenging and requires extending the framework to handle 
> so many output files. 

galaxy-user mailing list
