On Wed, 29 Jan 2003, Stein Aerts wrote:

> Hi,
>
> When currently parsing an exported sequence of an Ensembl mouse gene
> (using the Export Data function at www.ensembl.org) there appear to be 3
> problems:
> I tried to attach an example of an exported sequence of the Igf1 gene
> but then the message was bounced because of a suspicious header...
>
> 1. Some of the exon locations start with .0:
> I think this is a bug of the EMBL formatting at Ensembl?

Yes, this is pretty certainly a fault at our end, and I think I know where it is.

>
> FT   exon            .0:44020..44364
> FT                   /exon_id="ENSMUSE00000233709"
> FT                   /start_phase=0
> FT                   /end_phase=0
>
>
> 2. The first annotation of a CDS feature is written on the next line
> after CDS. This is not found by the EMBL parser.
> I think that is also a bug at Ensembl?
>

This is probably a line-length issue. I wonder what the right thing to do here is... Hmmm

> FT   CDS
> FT                   /gene="ENSMUSG00000020053"
>
>
> 3. Some of the lines cannot be parsed, for example the parser writes to
> System.out: "This line could not be parsed: exon 2001..2159"
> This one I don't understand, I cannot see a problem for these features?
>
> FT   exon            2001..2159
> FT                   /exon_id="ENSMUSE00000248454"
> FT                   /start_phase=0
> FT                   /end_phase=0
>
>
> Thank you in advance!

Stein - have you tried Mart inside Ensembl? For most people, this is a far easier way to get bulk downloads of stuff in a very easy-to-parse format.

http://www.ensembl.org/Homo_sapiens/martview

Choose feature list and/or gene structure when you get to output.

The Ensembl bugs should be fixed of course... ;)

> Stein.
>
> --
> Stein Aerts BioI@SISTA
> K.U.Leuven ESAT-SCD Belgium
> http://www.esat.kuleuven.ac.be/~dna/BioI
>
>
> _______________________________________________
> Biojava-l mailing list - [EMAIL PROTECTED]
> http://biojava.org/mailman/listinfo/biojava-l
>

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<[EMAIL PROTECTED]>.
-----------------------------------------------------------------
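
For problem 1, until the export is fixed at the Ensembl end, one possible client-side workaround is to strip the stray ".0:" prefix from feature lines before they reach the EMBL parser. The following is only a minimal sketch in plain Java, not BioJava code: the class EmblLineCleaner and the method stripZeroPrefix are hypothetical names, and it assumes the usual EMBL layout of an FT line (feature key followed by the location), as in the exon lines quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of BioJava): cleans EMBL feature-table
// lines before they are passed on to a parser.
class EmblLineCleaner {

    // Group 1: "FT", whitespace, a feature key that does not start with '/',
    //          and the whitespace before the location.
    // Then the stray ".0:" prefix, then group 2: the rest of the location.
    private static final Pattern ZERO_PREFIX =
            Pattern.compile("^(FT\\s+[^\\s/]\\S*\\s+)\\.0:(.*)$");

    // Returns the line without the ".0:" prefix on its location;
    // any line that does not match the pattern is returned unchanged.
    static String stripZeroPrefix(String line) {
        Matcher m = ZERO_PREFIX.matcher(line);
        return m.matches() ? m.group(1) + m.group(2) : line;
    }

    public static void main(String[] args) {
        // Affected exon line from the export: the prefix is removed.
        System.out.println(stripZeroPrefix("FT   exon            .0:44020..44364"));
        // Well-formed exon line: left as it is.
        System.out.println(stripZeroPrefix("FT   exon            2001..2159"));
        // Qualifier line: left as it is.
        System.out.println(stripZeroPrefix("FT                   /start_phase=0"));
    }
}
```

Each line of the exported file would be run through stripZeroPrefix before parsing. This only addresses the ".0:" locations; it does nothing for the CDS line-wrapping in problem 2 or the unparsed exon lines in problem 3.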