Hi Simon,

We have isolated the problem but have not made the correction yet. We cannot give an exact date when this will be released.
If you want to check back in a few weeks for an update, that may be the best way to follow up and see where we are with the correction.

Thanks again for your patience,
Jennifer

On 5/26/10 3:40 PM, Jennifer Jackson wrote:
> Hello Simon,
>
> Thank you for reporting this problem so clearly. We are able to
> reproduce it and are close to a solution (we have isolated the source of
> the problem).
>
> For now, please ignore CDS lines with start > end (and the immediately
> following stop-codon lines). There is a logic problem in our code with
> stop codons that span across two exons.
>
> Once we have a solution, we will send you an update and make the
> correction to the Table browser output tools and to download source code.
>
> We apologize for the inconvenience that this has caused you and your
> colleagues and again thank you very much for notifying us about the
> problem!
>
> We will be in touch,
>
> Jennifer
>
> ---------------------------------
> Jennifer Jackson
> UCSC Genome Informatics Group
> http://genome.ucsc.edu/
>
> On 5/26/10 1:27 PM, Simon Anders wrote:
>> Dear UCSC Genome Browser Team
>>
>> A question from a user of my software (CC'ed) led me to notice a
>> potential bug in the UCSC Genome Table Browser.
>>
>> According to the GFF specs, the value in the start column of a GFF or GTF
>> file must never be larger than the value in the end column. However, the
>> Table Browser does return such lines.
>>
>> Steps to reproduce:
>>
>> In the Table Browser, select the "NCBI37/mm9" assembly, the "UCSC Genes"
>> track and the "known genes" table. As region, set "chr1:40547900-40548100",
>> and request "GTF" output format.
>>
>> The output contains the following line, describing the last exon of
>> transcript 'uc007aug.1' (gene name Il18r1):
>>
>> chr1 mm9_knownGene CDS 40547903 40547900 0.000000 + 1 gene_id "uc007aug.1"; transcript_id "uc007aug.1";
>>
>> In this line, the CDS seems to have negative length, the end is left of
>> the start!
>>
>> The other transcripts of this gene do not have such a strange exon,
>> rather, the exon seems to actually extend to 40548061.
>>
>> Also note the two lines following the faulty one:
>>
>> chr1 mm9_knownGene stop_codon 40547901 40547903 0.000000 + . gene_id "uc007aug.1"; transcript_id "uc007aug.1";
>> chr1 mm9_knownGene exon 40547903 40548425 0.000000 + . gene_id "uc007aug.1"; transcript_id "uc007aug.1";
>>
>> A stop codon is listed that does not appear in the other transcripts of
>> the same gene that contain this exon. For example, transcript uc007auh.1
>> (for which this exon is not final) has its open reading frame spanning
>> the place of the erroneous stop codon:
>>
>> chr1 mm9_knownGene CDS 40547903 40548061 0.000000 + 2 gene_id "uc007auh.1"; transcript_id "uc007auh.1";
>>
>> Paola (the user who stumbled over this when my script gave an error due to
>> the end being before the start) wrote that she encountered 104 such lines
>> in the entire mm9 GTF file.
>>
>> Could it be that you have some bug in the treatment of prematurely
>> poly-adenylated transcripts?
>>
>> Best regards
>> Simon Anders
>>
>>
>> +---
>> | Dr. Simon Anders, Dipl.-Phys.
>> | European Molecular Biology Laboratory (EMBL), Heidelberg
>> | office phone +49-6221-387-8632
>> | preferred (permanent) e-mail: [email protected]
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
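
Until a corrected release is available, the interim workaround quoted above (ignore CDS lines with start > end, and the stop-codon lines immediately following them) could be applied with a small filter. The following is only a minimal sketch, not UCSC code: the function name filter_bad_cds is hypothetical, and it assumes the GTF file is tab-separated with the feature type in column 3, start in column 4 and end in column 5.

```python
def filter_bad_cds(lines):
    """Yield GTF lines, dropping CDS records whose start > end and any
    stop_codon records that immediately follow such a CDS record.

    Assumption: fields are tab-separated (feature = column 3,
    start = column 4, end = column 5).
    """
    skip_stop = False  # True while we are directly after a dropped CDS line
    for line in lines:
        stripped = line.rstrip("\n")
        # Pass blank lines and comment lines through unchanged.
        if not stripped or stripped.startswith("#"):
            yield line
            continue
        fields = stripped.split("\t")
        if len(fields) < 5:
            # Not a recognisable GTF record; keep it and reset the state.
            skip_stop = False
            yield line
            continue
        feature, start, end = fields[2], int(fields[3]), int(fields[4])
        if feature == "CDS" and start > end:
            skip_stop = True
            continue
        if skip_stop and feature == "stop_codon":
            # Still directly after the faulty CDS line: drop this one too.
            continue
        skip_stop = False
        yield line
```

The generator accepts any iterable of lines, so it can be fed an open file handle and its output written to a new file. All other records, including the exon line that follows the dropped stop codon, pass through unchanged.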
