George - That looks right to me. I don't have a developer account, so it would be great if you could check that in.
Thanks!
Scott

On Fri, Feb 17, 2012 at 12:56 PM, George Waldon <[email protected]> wrote:
> Hi Scott,
>
> Yes, well done. You need to fix rettype too. So, if I have it correct, we
> should uncomment and have:
>
>   rettype = "gb"
>   retmode = "txt"
>
> and existing code should not be broken. What do you think? I can commit if
> you do not have a developer account.
>
> Thanks,
> - George
>
> Quoting Scott Frees <[email protected]>:
>
>> George - Thanks for your response.
>>
>> I think I tracked down the problem. When building the FetchURL,
>> GenbankRichSequenceDB uses "genbank" as the db. In the
>> org.biojava.bio.seq.db.FetchURL constructor, rettype and retmode are
>> specifically not set when given "genbank" - see lines 54-55 commented
>> out.
>>
>>   // rettype = format;
>>   // retmode = format;
>>
>> Entrez recently updated their API
>> (http://www.ncbi.nlm.nih.gov/books/NBK25501/) on Wednesday and in the
>> release notes they say they've set defaults on each database for
>> retmode. I'm new to biojava and entrez, but I can only assume that
>> the "genbank" db used to return sequences as text always, which is why
>> FetchURL doesn't include the parameter in the URL it builds. It looks
>> like the default now is XML - which breaks the GenbankRichSequenceDB
>> parser.
>>
>> I proved it out by subclassing GenbankRichSequenceDB to set the
>> retmode parameter as text, and the problem is resolved.
>>
>> @Override
>> protected URL getAddress(String id) throws MalformedURLException {
>>     FetchURL seqURL = new FetchURL("Genbank", "text");
>>     String baseurl = seqURL.getbaseURL();
>>     String db = seqURL.getDB();
>>     // added retmode=text
>>     String url = baseurl + db + "&id=" + id + "&rettype=gb&retmode=text&tool="
>>             + getTool() + "&email=" + getEmail();
>>     return new URL(url);
>> }
>>
>> I think a more elegant solution would be to simply fix FetchURL to use
>> the retmode parameter.
>>
>> Regards -
>> Scott
>>
>> On Thu, Feb 16, 2012 at 8:53 PM, George Waldon <[email protected]>
>> wrote:
>>>
>>> Hello Scott,
>>>
>>> This appears to be an exception thrown by the parser. Is there a way you
>>> can fetch the sequence(s) as a text file before the exception occurs? It
>>> would be interesting to see if you can reproduce the exception; you can
>>> send me the file if you want.
>>>
>>> Regards,
>>> George
>>>
>>> Quoting Scott Frees <[email protected]>:
>>>
>>>> Hello -
>>>>
>>>> I have developed an application that searches and compares
>>>> g-quadruplexes within mRNA. The web application has been running
>>>> without any problems on several different web servers for over a year.
>>>> Suddenly, just this week, it is unable to download sequence data
>>>> using GenbankRichSequenceDB - has anyone else had this problem?
>>>>
>>>> We are using BioJava 1.8.1
>>>>
>>>> Below is the exception trace, and the code that follows is a small
>>>> test app that generates the exception. This code worked without any
>>>> problems prior to Tuesday this week, and we haven't made any
>>>> modification to our application.
>>>> ------------------------------------------------------
>>>> org.biojava.bio.BioException: Failed to read Genbank sequence
>>>>     at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:163)
>>>>     at Tester.main(Tester.java:11)
>>>>
>>>> Caused by: org.biojava.bio.BioException: Could not read sequence
>>>>     at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113)
>>>>     at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:159)
>>>>     ... 1 more
>>>>
>>>> Caused by: org.biojava.bio.seq.io.ParseException:
>>>>
>>>> A Exception Has Occurred During Parsing.
>>>> Please submit the details that follow to [email protected] or post
>>>> a bug report to http://bugzilla.open-bio.org/
>>>>
>>>> Format_object=org.biojavax.bio.seq.io.GenbankFormat
>>>> Accession=null
>>>> Id=null
>>>> Comments=Bad section
>>>> Parse_block=<?xml version="1.0"?>
>>>> Stack trace follows ....
>>>>
>>>>     at org.biojavax.bio.seq.io.GenbankFormat.readSection(GenbankFormat.java:620)
>>>>     at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:279)
>>>>     at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
>>>>     ... 2 more
>>>>
>>>> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -4
>>>>     at java.lang.String.substring(Unknown Source)
>>>>     at java.lang.String.substring(Unknown Source)
>>>>     at org.biojavax.bio.seq.io.GenbankFormat.readSection(GenbankFormat.java:610)
>>>>     ... 4 more
>>>>
>>>> -----------------------------
>>>>
>>>> import org.biojava.bio.BioException;
>>>> import org.biojava.bio.seq.db.IllegalIDException;
>>>> import org.biojavax.bio.db.ncbi.GenbankRichSequenceDB;
>>>> import org.biojavax.bio.seq.RichSequence;
>>>>
>>>> public class Tester {
>>>>     public static void main(String args[]) {
>>>>         String id = "NM_001110.2"; // Issue occurs with any ID
>>>>
>>>>         GenbankRichSequenceDB ncbi = new GenbankRichSequenceDB();
>>>>         try {
>>>>             RichSequence rs = ncbi.getRichSequence(id);
>>>>             System.out.println(rs.seqString());
>>>>         } catch (IllegalIDException e) {
>>>>             e.printStackTrace();
>>>>         } catch (BioException e) {
>>>>             e.printStackTrace();
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> _______________________________________________
>>>> Biojava-l mailing list - [email protected]
>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>>> --------------------------------
>>> George Waldon
>
> --------------------------------
> George Waldon

_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
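
The fix discussed in this thread comes down to always sending rettype and
retmode explicitly rather than relying on the server-side defaults. A minimal,
standalone sketch of that idea follows. It does not use BioJava and is not the
FetchURL code: the class name EfetchUrlSketch, the method buildEfetchUrl, the
https base URL, the "nucleotide" db value, and the tool/email values are all
assumptions made for illustration. The retmode value is left as a parameter;
"text" is the value Scott reported working, while George's message proposes
"txt".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of BioJava: builds an efetch URL that
// always states rettype and retmode instead of trusting server defaults.
class EfetchUrlSketch {

    // Assumed E-utilities efetch endpoint (not quoted in the thread).
    static final String BASE_URL =
            "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi";

    static String buildEfetchUrl(String db, String id, String rettype,
                                 String retmode, String tool, String email) {
        return BASE_URL
                + "?db=" + enc(db)
                + "&id=" + enc(id)
                + "&rettype=" + enc(rettype)   // e.g. "gb" for GenBank flat file
                + "&retmode=" + enc(retmode)   // explicit, so a changed default cannot switch the format
                + "&tool=" + enc(tool)
                + "&email=" + enc(email);
    }

    // URL-encode a single query parameter value as UTF-8.
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "nucleotide", "mytool" and the email address are placeholder values.
        System.out.println(buildEfetchUrl(
                "nucleotide", "NM_001110.2", "gb", "text",
                "mytool", "user@example.org"));
    }
}
```

The sketch only builds the URL string and never contacts NCBI, so it can be
used to check what a client would request before any parser is involved.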
