Okay, this has already been fixed for RefSeq files. The demo TestRefSeqPrt
has the code for parsing these files, and I've attached the relevant snippet.
"**" marks what should change. Let me know if you run into any problems with
RefSeq; you get to be tester #1.

Greg

    SequenceFormat gFormat = new GenbankFormat();
    BufferedReader gReader = new BufferedReader(
        new InputStreamReader(new FileInputStream(genbankFile)));
**  SequenceBuilderFactory sbFact =
**      new ProteinRefSeqProcessor.Factory(SimpleSequenceBuilder.FACTORY);
    Alphabet alpha = ProteinTools.getTAlphabet();
    SymbolParser rParser = alpha.getParser("token");
    SequenceIterator seqI =
        new StreamReader(gReader, gFormat, rParser, sbFact);

> -----Original Message-----
> From: Cox, Greg [mailto:[EMAIL PROTECTED]]
> Sent: Monday, July 23, 2001 8:06 AM
> To: '[EMAIL PROTECTED]'; [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: RE: [Biojava-l] need help for SimpleSequenceBuilder class
>
> Hi Bruce,
>     I've done a lot with Genbank files, and the problem isn't actually in
> SimpleSequenceBuilder; that's just the symptom. The feature table renderer
> builds a stranded feature by default, and that's not acceptable for
> proteins. I'll look into your case and try to get a fix into CVS later
> today.
>
> Greg
>
> -----Original Message-----
> From: Bruce Ling [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, July 22, 2001 11:17 AM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: [Biojava-l] need help for SimpleSequenceBuilder class
>
> Hi, Thomas,
>
> Since the docs list you as the author of the SimpleSequenceBuilder class, I
> am asking for your help with the following problem.
>
> I am using the BioJava GenbankFormat class, and my code is as follows:
>
> {
>     SequenceFormat gFormat = new GenbankFormat();
>     SequenceBuilderFactory sbFact =
>         new GenbankProcessor.Factory(SimpleSequenceBuilder.FACTORY);
>     //Alphabet alpha = DNATools.getDNA();
>     // the following line does not work for protein; need more work to
>     // figure out the library
>     Alphabet alpha = ProteinTools.getAlphabet();
>     SymbolParser rParser = alpha.getParser("token");
>     seqI =
>         new StreamReader(gReader, gFormat, rParser, sbFact);
> }
>
> See the commented-out part: if I use a DNA GenBank file, such as the
> sample in the demo, it works fine. But if I use the above code with the
> PROTEIN alphabet to parse a protein record in GenBank format, such as:
> http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=NP_005154&form=6&db=p&Dopt=g
>
> it gives the exception shown at the end of this email.
>
> I have traced the problem to the SimpleSequenceBuilder class, at
> TemplateWithChildren. It seems that by default it assumes this is a DNA
> GenBank record, which is why it tries to create a stranded feature, which a
> protein record does not have.
>
> public Sequence makeSequence() {
>     SymbolList symbols = slBuilder.makeSymbolList();
>     Sequence seq = new SimpleSequence(symbols, uri, name, annotation);
>     try {
>         for (Iterator i = rootFeatures.iterator(); i.hasNext(); ) {
>             TemplateWithChildren twc = (TemplateWithChildren) i.next();
>             Feature f = seq.createFeature(twc.template);
>             if (twc.children != null) {
>                 makeChildFeatures(f, twc.children);
>             }
>         }
>     } catch (Exception ex) {
>         throw new BioError(ex, "Couldn't create feature");
>     }
>     return seq;
> }
>
> ==================================
> java Exceptions
> ==================================
> java.lang.reflect.InvocationTargetException:
> org.biojava.bio.symbol.IllegalAlphabetException: Can not create a stranded
> feature within a sequence of type PROTEIN
>     at org.biojava.bio.seq.impl.SimpleStrandedFeature.<init>(SimpleStrandedFeature.java:76)
>     at java.lang.reflect.Constructor.newInstance(Native Method)
>     at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:136)
> rethrown as org.biojava.bio.BioException: Couldn't realize feature
>     at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:138)
>     at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealizer.java:92)
>     at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:176)
>     at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:182)
>     at org.biojava.bio.seq.io.SimpleSequenceBuilder.makeSequence(SimpleSequenceBuilder.java:154)
> rethrown as org.biojava.bio.BioError: Couldn't create feature
>     at org.biojava.bio.seq.io.SimpleSequenceBuilder.makeSequence(SimpleSequenceBuilder.java:160)
>     at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:98)
>     at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
>
> Thanks.
>
> Bruce Ling, Ph.D.
> Director, Bioinformatics
> Tularik, Inc -- http://www.tularik.com
> Email: [EMAIL PROTECTED]
> Phone: 650-825-7143
> fax: 1-435-804-4009
>
> _______________________________________________
> Biojava-l mailing list - [EMAIL PROTECTED]
> http://biojava.org/mailman/listinfo/biojava-l
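
For anyone reading this thread later, here is a rough, self-contained sketch of the failure mode described above: a feature template that is stranded by default cannot be realized on a protein sequence, while a template chosen according to the alphabet can. It does not use BioJava at all. Every name in it (StrandedFeatureSketch, Alphabet, FeatureTemplate, realize, templateFor) is a hypothetical stand-in invented for illustration, and the real ProteinRefSeqProcessor may well work differently.

```java
public class StrandedFeatureSketch {
    // Hypothetical stand-in for a sequence alphabet; not a BioJava type.
    public enum Alphabet { DNA, PROTEIN }

    // Hypothetical feature template; 'stranded' marks templates that carry a strand.
    public static final class FeatureTemplate {
        final String type;
        final boolean stranded;

        public FeatureTemplate(String type, boolean stranded) {
            this.type = type;
            this.stranded = stranded;
        }
    }

    // Mimics the kind of check behind the exception in the trace:
    // a strand only makes sense on a nucleotide sequence.
    public static String realize(Alphabet alphabet, FeatureTemplate template) {
        if (template.stranded && alphabet != Alphabet.DNA) {
            throw new IllegalArgumentException(
                "Can not create a stranded feature within a sequence of type " + alphabet);
        }
        return (template.stranded ? "stranded " : "plain ") + template.type;
    }

    // Picks the template kind from the alphabet instead of always assuming DNA.
    public static FeatureTemplate templateFor(Alphabet alphabet, String type) {
        return new FeatureTemplate(type, alphabet == Alphabet.DNA);
    }

    public static void main(String[] args) {
        // A stranded-by-default template on a protein sequence fails.
        try {
            realize(Alphabet.PROTEIN, new FeatureTemplate("Region", true));
        } catch (IllegalArgumentException ex) {
            System.out.println("failed: " + ex.getMessage());
        }
        // An alphabet-aware template is accepted for both kinds of sequence.
        System.out.println(realize(Alphabet.PROTEIN, templateFor(Alphabet.PROTEIN, "Region")));
        System.out.println(realize(Alphabet.DNA, templateFor(Alphabet.DNA, "CDS")));
    }
}
```

In this toy model the choice of template plays the role that swapping the processor factory plays in the snippet marked with "**": the rest of the parsing code stays the same, and only the component that decides what kind of feature to build changes.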