Sorry for the late response. I have also noticed the problem with UniProt files recently. I am going to file a bug report. In the future, you can submit bugs yourself on Bugzilla at http://bugzilla.open-bio.org/.
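
In the meantime, a rough way to see which DT line layout a flat file uses is sketched below. It relies on assumptions of mine that nothing in this thread confirms: that UniProt flat files exist with two DT layouts (the older "(Rel. NN, Created)" form seen in the entry quoted below, and a newer "date, text." form), and that the layout might be what the parser objects to. The class name DtLineCheck, the method classify, and both regular expressions are hypothetical and are not part of BioJava.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava: classifies UniProt DT lines by layout.
class DtLineCheck {

    // Assumed older layout, as in the entry quoted in this thread:
    //   DT   01-JAN-1990 (Rel. 13, Created)
    private static final Pattern OLD_STYLE = Pattern.compile(
            "^DT\\s+\\d{2}-[A-Z]{3}-\\d{4} \\(Rel\\. \\d+, [^)]+\\)\\s*$");

    // Assumed newer layout:
    //   DT   01-JAN-1990, integrated into UniProtKB/Swiss-Prot.
    private static final Pattern NEW_STYLE = Pattern.compile(
            "^DT\\s+\\d{2}-[A-Z]{3}-\\d{4}, .+\\.\\s*$");

    // Returns "old-style", "new-style", "unrecognized", or "not-dt".
    static String classify(String line) {
        if (!line.startsWith("DT")) {
            return "not-dt";
        }
        if (OLD_STYLE.matcher(line).matches()) {
            return "old-style";
        }
        if (NEW_STYLE.matcher(line).matches()) {
            return "new-style";
        }
        return "unrecognized";
    }

    public static void main(String[] args) {
        String[] samples = {
            "DT   01-JAN-1990 (Rel. 13, Created)",
            "DT   01-JAN-1990, integrated into UniProtKB/Swiss-Prot.",
            "DE   Protein fosB."
        };
        // Print the classification next to each sample line.
        for (String s : samples) {
            System.out.println(classify(s) + "  " + s);
        }
    }
}
```

If every DT line in a failing file comes back as "old-style", that would be consistent with the guess above, but it does not prove that the DT layout is the cause of the ParseException.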
Best,
George

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sofia Burvall
Sent: Friday, March 16, 2007 9:31 AM
To: [email protected]
Subject: [Biojava-l] Uniprot files

Hi!

I have just started to get to know biojava. I have written a small program that reads a file with the help of the biojavax method

RichSequence.IOTools.readFile(filen,ns );

and then tries to write the file as UniProt using

RichSequence.IOTools.writeUniProt(System.out, seqit, ns);

This works nicely when I read a fasta file. But when I try to read a Uniprot file I get this error message:

org.biojava.bio.BioException: Could not read sequence
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113)
        at org.biojavax.bio.seq.io.RichStreamReader.nextSequence(RichStreamReader.java:92)
        at org.biojavax.bio.seq.io.RichStreamWriter.writeStream(RichStreamWriter.java:66)
        at org.biojavax.bio.seq.RichSequence$IOTools.writeUniProt(RichSequence.java:1426)
        at bc_biojava.GeneralReader.main(GeneralReader.java:81)
Caused by: org.biojava.bio.seq.io.ParseException: Bad date line found: 01-JAN-1990 (Rel. 13, Created)
        at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:349)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
        ... 4 more

When I try other uniprot files i get the same error. It complains about "Bad date line..". What can be the reason for this? Is it the wrong file format?

Cheers
/Sofia

*** Here is the UniProt flat file: ***

ID   FOSB_MOUSE     STANDARD;      PRT;   338 AA.
AC   P13346;
DT   01-JAN-1990 (Rel. 13, Created)
DT   01-JAN-1990 (Rel. 13, Last sequence update)
DT   15-JUN-2002 (Rel. 41, Last annotation update)
DE   Protein fosB.
GN   FOSB.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX   NCBI_Taxid=10090;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=89251612; PubMed=2498083;
RA   Zerial M., Toschi L., Ryseck R.-P., Schuermann M., Mueller R.,
RA   Bravo R.;
RT   "The product of a novel growth factor activated gene, fos B, interacts
RT   with JUN proteins enhancing their DNA binding activity.";
RL   EMBO J. 8:805-813(1989).
RN   [2]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=92158623; PubMed=1741260;
RA   Lazo P.S., Dorfman K., Noguchi T., Mattei M.-G., Bravo R.;
RT   "Structure and mapping of the fosB gene. FosB downregulates the
RT   activity of the fosB promoter.";
RL   Nucleic Acids Res. 20:343-350(1992).
CC   -!- FUNCTION: FOSB INTERACTS WITH JUN PROTEINS ENHANCING THEIR DNA
CC       BINDING ACTIVITY.
CC   -!- SUBUNIT: HETERODIMER (BY SIMILARITY).
CC   -!- SUBCELLULAR LOCATION: NUCLEAR.
CC   -!- INDUCTION: BY GROWTH FACTORS.
CC   -!- SIMILARITY: BELONGS TO THE BZIP FAMILY. FOS SUBFAMILY.
CC   --------------------------------------------------------------------------
CC   This Swiss-Prot entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to [EMAIL PROTECTED]).
CC   --------------------------------------------------------------------------
DR   EMBL; X14897; CAA33026.1; -.
DR   EMBL; AF093624; AAD13196.1; -.
DR   PIR; S04108; TVMSFB.
DR   PIR; S35477; S35477.
DR   HSSP; P01100; 1FOS.
DR   TRANSFAC; T00291; -.
DR   MGD; MGI:95575; Fosb.
DR   InterPro; IPR000837; Leuzip_Fos.
DR   InterPro; IPR004827; TF_bZIP.
DR   Pfam; PF00170; bZIP; 1.
DR   PRINTS; PR00042; LEUZIPPRFOS.
DR   SMART; SM00338; BRLZ; 1.
DR   PROSITE; PS00036; BZIP_BASIC; 1.
KW   Nuclear protein; DNA-binding.
FT   DNA_BIND    161    179       BASIC MOTIF.
FT   DOMAIN      183    211       LEUCINE-ZIPPER.
SQ   SEQUENCE   338 AA;  35976 MW;  E9D031A4BEAE48EC CRC64;
     MFQAFPGDYD SGSRCSSSPS AESQYLSSVD SFGSPPTAAA SQECAGLGEM PGSFVPTVTA
     ITTSQDLQWL VQPTLISSMA QSQGQPLASQ PPAVDPYDMP GTSYSTPGLS AYSTGGASGS
     GGPSTSTTTS GPVSARPARA RPRRPREETL TPEEEEKRRV RRERNKLAAA KCRNRRRELT
     DRLQAETDQL EEEKAELESE IAELQKEKER LEFVLVAHKP GCKIPYEEGP GPGPLAEVRD
     LPGSTSAKED GFGWLLPPPP PPPLPFQSSR DAPPNLTASL FTHSEVQVLG DPFPVVSPSY
     TSSFVLTCPE VSAFAGAQRT SGSEQPSDPL NSPSLLAL
//

_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
