On 5/19/13 1:35 PM, Zain A Alvi wrote:
Thank you for the information regarding the FastQ information. It was
Lately, I have been getting the following error: "Error getting
history update from this server- Bad Gateway". This occurred after I
tried to re-upload some pre-aligned and indexed BAM files from NCBI
GEO because I was hoping to generate and retrieve FPKM/RPKM values
This has now been resolved, very sorry for the confusion it caused.
Unfortunately, my old files are still not available on Galaxy, and
I get an Internal Server Error when trying to retrieve them, although
I can still get the workflow for them.
Same, resolved now.
I can't recommend a conversion tool, but there are a few on the web that
could be tested out, if you decide to go that route. I do know that
certain GFF3 files directly from FlyBase have been problematic with the
RNA-seq tools due to duplicated "ID" attributes. I don't know whether
this affects all versions or just the dm3 version. That said, the issue
has been isolated to a few records (a gene mapping to >1 location), and
there isn't any reason why you shouldn't test out the /D. pseudoobscura/
version and then adjust it, if needed.
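If you want to see which records are affected before running the RNA-seq tools, a quick scan of the 9th column can list the repeated "ID" values. This is only a minimal sketch: the function name is made up for illustration, and it assumes a standard tab-delimited GFF3 file with `key=value` attributes separated by semicolons. Note that the GFF3 specification lets discontinuous features (for example, CDS segments) share one ID, so not every hit is necessarily an error.

```python
from collections import Counter


def duplicated_gff3_ids(lines):
    """Return {ID: count} for ID attribute values used by more than one record.

    `lines` is any iterable of GFF3 lines (for example, an open file handle).
    The function name is hypothetical, not part of any Galaxy or FlyBase tool.
    """
    counts = Counter()
    for line in lines:
        # Skip blank lines, comments, and directives such as ##gff-version.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Anything without 9 tab-separated columns is not a feature record.
        if len(fields) < 9:
            continue
        for pair in fields[8].split(";"):
            key, sep, value = pair.strip().partition("=")
            if sep and key == "ID":
                counts[value] += 1
    return {gff_id: n for gff_id, n in counts.items() if n > 1}
```

To use it on a downloaded file, something like `duplicated_gff3_ids(open("annotation.gff3"))` would do, where the file name is a placeholder.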
The last weird error is that when I use Cuffdiff, I get an FPKM of 0
with p/q values of 1 for every single gene. This should not be the
case, as the BAM files are from two different organs, so something
must be wrong. I was able to retrieve the GTF file from UCSC main with
the following settings:
Insect - D. pseudoobscura
Group - Genes and Gene Prediction Tracks
Output format: GTF.
I was wondering whether these settings are fine or whether I should
change the Group to mRNA or some other setting. The one that is
available on UCSC is the old dp3 file from 2004, though; the latest
GFF on FlyBase is 3.1. Is there any way to convert it to a GTF file?
The GTF file from the UCSC Table browser is correct, but Cuffdiff is
looking for attributes that this version of the file does not have. If
you examine the attributes in the 9th field of the file and compare
them to the Cuffdiff input documentation, you can see how they differ.
The gene_id and transcript_id have the same value, and other
attributes, such as tss_id and p_id, are not present. There is nothing
wrong with the file, but without these attributes populated a particular
way, certain calculations will not be done.
These variations are just different projects following a slightly
different file specification. Some are content variations, some are
format variations. This is common with this file type family (GFF, GTF,
GFF3). This is why iGenomes creates files specifically for certain
genomes for use with this tool set.
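To make the comparison of the 9th field concrete, here is a rough sketch that tallies which attributes a GTF file carries and how often gene_id and transcript_id are identical. The function name and the returned dictionary layout are made up for illustration, and it assumes the usual GTF attribute style of `key "value";` pairs.

```python
import re
from collections import Counter

# Matches GTF-style attributes such as: gene_id "NM_0001";
ATTRIBUTE = re.compile(r'(\w+)\s+"([^"]*)"')


def summarize_gtf_attributes(lines):
    """Count attribute keys in column 9 and records where gene_id == transcript_id.

    Hypothetical helper for inspection only; it does not validate the file.
    """
    key_counts = Counter()
    records = 0
    same_ids = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        attrs = dict(ATTRIBUTE.findall(fields[8]))
        records += 1
        key_counts.update(attrs.keys())
        if "gene_id" in attrs and attrs["gene_id"] == attrs.get("transcript_id"):
            same_ids += 1
    return {
        "records": records,
        "attribute_counts": dict(key_counts),
        "gene_id_equals_transcript_id": same_ids,
    }
```

If tss_id and p_id are absent from `attribute_counts`, or every record has identical gene_id and transcript_id values, the file has the attribute differences described above.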
When you do obtain a file that has the format and content you want to
use, double check that the chromosome names are *exactly* the same
between the reference genome, TopHat output, and GTF or GFF3 file.
Mismatches can also lead to calculations being missed.
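One way to run that check outside of Galaxy is sketched below. It assumes you have the reference as FASTA, the BAM header as text (for example, from `samtools view -H`), and the annotation as a tab-delimited GTF or GFF3 file. All function names are made up for illustration.

```python
def fasta_names(lines):
    """Sequence names from FASTA header lines (text after '>' up to whitespace)."""
    return {line[1:].split()[0] for line in lines
            if line.startswith(">") and line[1:].strip()}


def sam_header_names(lines):
    """Sequence names from the SN: tags of @SQ lines in a SAM/BAM header."""
    names = set()
    for line in lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def annotation_names(lines):
    """Sequence names from column 1 of a GTF or GFF3 file."""
    return {line.split("\t", 1)[0] for line in lines
            if line.strip() and not line.startswith("#") and "\t" in line}


def compare_names(**sources):
    """For each labelled source, list the names seen elsewhere but missing from it."""
    all_names = set().union(*sources.values())
    return {label: sorted(all_names - names) for label, names in sources.items()}
```

Calling `compare_names(genome=..., bam=..., gtf=...)` with the three sets returns an empty list for every label only when the names agree exactly. A typical mismatch is "chr2" in one file versus "2" in another.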
iGenomes did not produce a file for fruit fly, but you could request one
from them. This is where they publish the data for other genomes, and
there is a link to the project at the top of the page:
Good luck with your project,
Sorry for so many questions. Thank you again for the great help.
Galaxy Support and Training