I have been trying to get reference data from the UCSC browser into Galaxy, but 
when I try to get the rat genome in GTF format, the file stops halfway 
through chromosome 10.  This happens with both of the available builds for the 
rat.  I am guessing this is a problem with the UCSC files and not Galaxy, but 
I was wondering if you could help with this issue.  When I try 
to get just chromosome 10, the GTF file halts in the same place as it does when 
I try to get the whole genome, with this message at the bottom of the file:

chr10   rn4_refGene     start_codon     4293086 4293088 0.000000        +       .       gene_id "NM_001008876"; transcript_id "NM_001008876_dup1"; 
offsetToGenomic: need previous exon, but given index of 0

Any ideas?  I have no problem importing the "KnownGenes" option from UCSC. 
Would this file work well as a reference?  Also, what is the difference between 
the different files from UCSC (i.e., "gene and gene prediction tracks" vs. "mRNA 
and EST tracks", or "KnownGenes" vs. "RefSeq")?  I guess I could try to download 
each chromosome separately to avoid that line in chromosome 10 and then 
concatenate them.  I tried downloading the rat GTF from Ensembl, but when I 
brought it into Galaxy, it wasn't formatted properly, and it didn't have the 
NM_0001 accession numbers associated with it but rather some other type of 
label, so it looks like it might need some grooming before use with the NGS 
suite. 
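
For reference, the per-chromosome workaround mentioned above could be sketched roughly as follows. This is only a minimal sketch, assuming one GTF file per chromosome has already been downloaded; the function names and the file names in the usage note are hypothetical. It keeps lines that have the nine tab-separated GTF columns with integer start and end positions, and reports everything else (such as the offsetToGenomic message). Note that it cannot recover records that were never written after the error, so the chromosome 10 file would still be incomplete.

```python
def is_gtf_record(line):
    """Return True if the line has nine tab-separated columns and integer start/end."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return False
    return fields[3].isdigit() and fields[4].isdigit()


def concatenate_gtf(paths, out_path):
    """Write well-formed records from each input file to out_path, in order.

    Returns a list of (file, line number, text) tuples for the lines that
    were skipped because they do not look like GTF records.
    """
    skipped = []
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                for number, line in enumerate(handle, start=1):
                    # Comment lines and blank lines are dropped silently.
                    if line.startswith("#") or not line.strip():
                        continue
                    if is_gtf_record(line):
                        out.write(line)
                    else:
                        skipped.append((str(path), number, line.rstrip("\n")))
    return skipped
```

It might be called as concatenate_gtf(["chr1.gtf", "chr2.gtf", "chr10.gtf"], "rn4_refGene.gtf"), after which the returned list shows which lines were left out and from which file.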

Thank you for any suggestions,
David Martin