I'm using 1.7, partly because my distro had a package for it and partly
because I was initially using the online Javadoc a lot. PDB ID 1a02
parses with CA only, but gives 4 StructureExceptions; I've pasted them
below. Chain A exists in the PDB file but is DNA; polypeptide chain F
appears to parse correctly.
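
One way to batch past entries like this is to check what actually came back instead of relying on the exception reaching the caller. A minimal sketch, assuming a hypothetical `Loader` interface that stands in for the real parser call (it is not part of BioJava), made-up entry names, and an arbitrary threshold of 30 CA atoms:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BatchSkipSketch {

    /** Hypothetical stand-in for the real parser call; this is NOT a BioJava type. */
    interface Loader {
        /** Returns chain ID -> number of CA atoms parsed, or throws if loading fails. */
        Map<String, Integer> load(String entryId) throws Exception;
    }

    /** Keeps only entries that load and have at least one chain with minCaAtoms CA atoms. */
    static List<String> usableEntries(List<String> entryIds, Loader loader, int minCaAtoms) {
        List<String> usable = new ArrayList<>();
        for (String id : entryIds) {
            Map<String, Integer> caCounts;
            try {
                caCounts = loader.load(id);
            } catch (Exception e) {
                // The exception did reach us: report it and move on to the next entry.
                System.err.println("skipping " + id + ": " + e.getMessage());
                continue;
            }
            if (caCounts == null) {
                continue; // nothing came back at all
            }
            // The parser may have swallowed an error, so inspect the result itself.
            boolean hasProteinChain = false;
            for (int count : caCounts.values()) {
                if (count >= minCaAtoms) {
                    hasProteinChain = true;
                    break;
                }
            }
            if (hasProteinChain) {
                usable.add(id);
            }
        }
        return usable;
    }

    /** Runs the filter on three made-up entries (illustrative data, not real PDB contents). */
    static List<String> demo() {
        Loader fake = id -> {
            if (id.equals("broken")) {
                throw new Exception("simulated parse failure");
            }
            Map<String, Integer> counts = new LinkedHashMap<>();
            if (id.equals("protein")) {
                counts.put("A", 120);
            } else {
                counts.put("A", 0); // e.g. nucleic-acid chains with no CA atoms
                counts.put("B", 0);
            }
            return counts;
        };
        return usableEntries(Arrays.asList("protein", "dnaOnly", "broken"), fake, 30);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In real use the `Loader` would wrap whatever reader call is in play and count CA atoms per chain; that wiring is left out here because it depends on the BioJava version.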

-da

org.biojava.bio.structure.StructureException: could not find chain A
    at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:217)
    at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:223)
    at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2303)
    at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
    at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
    at fragalign.pair.getStructs(pair.java:42)
    at fragalign.Main.main(Main.java:40)
org.biojava.bio.structure.StructureException: could not find chain B
    at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:217)
    at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:223)
    at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2303)
    at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
    at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
    at fragalign.pair.getStructs(pair.java:42)
    at fragalign.Main.main(Main.java:40)
org.biojava.bio.structure.StructureException: did not find chain with chainId >A<
    at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:541)
    at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:548)
    at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2340)
    at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
    at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
    at fragalign.pair.getStructs(pair.java:42)
    at fragalign.Main.main(Main.java:40)
org.biojava.bio.structure.StructureException: did not find chain with chainId >B<
    at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:541)
    at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:548)
    at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2340)
    at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
    at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
    at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
    at fragalign.pair.getStructs(pair.java:42)
    at fragalign.Main.main(Main.java:40)

On Wed, Oct 27, 2010 at 17:47, Andreas Prlic <[email protected]> wrote:
>> I assume AtomCache is a new class in BioJava3?
>
> yes it is... http://biojava.org/wiki/BioJava:CookBook:PDB:read3.0
>
>>
>> I must give you my embarrassed apology...after a bunch of testing I
>> finally figured out that I had misunderstood where the Parser's error
>> handling returns control and started going after the wrong exceptions.
>> It does looks like if setParseCAOnly is true, the reader excepts on
>> chains with no CA's instead of just skipping them, though the other
>> chains are still parsed into the structure.
>
> This sounds like there might be a problem with CA only.. do you have
> an example ID? also: are you on biojava 1.7 or 3.0 ?
>
> Andreas
>
>
>
>>
>> -da
>>
>> On Tue, Oct 26, 2010 at 22:19, Andreas Prlic <[email protected]> wrote:
>>> Hi Daniel,
>>>
>>> PDB files are better nowadays, due to remediation, however there are
>>> still issues..
>>>
>>> it sounds like you just want to figure out how to do the try/catch
>>> block properly. You could do something like that:
>>>
>>> boolean splitFileOrganisation = true;
>>> AtomCache cache = new AtomCache("/path/to/your/installation/",splitFileOrganisation);
>>>
>>> String[] pdbIDs = new String[]{"4hhb", "1cdg","5pti","1gav", "WRONGID" };
>>>
>>> for (String pdbID : pdbIDs){
>>>
>>>     try {
>>>         Structure s = cache.getStructure(pdbID);
>>>         if ( s == null) {
>>>             System.out.println("could not find structure " + pdbID);
>>>             continue;
>>>         }
>>>         // do something with the structure - your inner loop
>>>         System.out.println(s);
>>>
>>>     } catch (Exception e){
>>>         // something crazy happened...
>>>         System.err.println("Can't load structure " + pdbID + " reason: " + e.getMessage());
>>>         e.printStackTrace();
>>>     }
>>> }
>>>
>>>
>>>
>>>
>>> On Tue, Oct 26, 2010 at 9:59 PM, Daniel Asarnow <[email protected]> wrote:
>>>> Glad to hear it, who doesn't like support or clean interfaces?. No
>>>> offense intended, by the way, with respect to PDB errors - obviously
>>>> the PDB is an indispensable resource for all protein scientists.
>>>>
>>>> I am looking at many (fixed-length) pieces of protein chains and doin'
>>>> stuff with 'em. My current code has a pair of nested while loops; the
>>>> outer iterates over PDB entries (locally rsync'd copy), parsing them
>>>> and the inner iterates over the pieces from each. When
>>>> StructureExceptions come out of my PDBFileReader object I want to
>>>> continue the outer loop, moving on to the next set of files without
>>>> executing any of the code that depends on correct StructureImpl
>>>> objects from the reader (database updates, the inner loop).
>>>> Since the reader's methods have their own try-catch blocks, a thrown
>>>> StructureException is stopped there and never reaches my own error
>>>> handling.
>>>> I just need to know when those errors occur so I can skip
>>>> those proteins - I am presuming that the correct entries will outweigh
>>>> the problem ones by a significant factor and the overall data wont be
>>>> seriously impacted.
>>>>
>>>> -da
>>>>
>>>> On Tue, Oct 26, 2010 at 21:11, Andreas Prlic <[email protected]> wrote:
>>>>> Hi Daniel,
>>>>>
>>>>> can you explain a bit more what you are doing, in particular what
>>>>> errors you would like to deal with on your end? You should not need
>>>>> to worry too much about exception handling. Are there any special
>>>>> cases you are interested in? In this case we should support you with
>>>>> a clean interface rather than exception handling from your end...
>>>>>
>>>>> Andreas
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 26, 2010 at 8:54 PM, Daniel Asarnow <[email protected]>
>>>>> wrote:
>>>>>> Hi all,
>>>>>> Let me first say thanks to all the BioJava community members for
>>>>>> delivering such a useful set of libraries, and that I'm still a newbie
>>>>>> when it comes to BioJava (and Java) so forgive me if my question is
>>>>>> too trivial.
>>>>>>
>>>>>> I am doing work on lots (at least thousands) of PDB files from RCSB.
>>>>>> As is commonly known, these are often rife with errors which can lead
>>>>>> to exceptions during parsing with PDBFileParser. Because
>>>>>> PDBFileParser's methods contain their own try-catch blocks, exception
>>>>>> propagation stops there and my code proceeds blindly along regardless
>>>>>> of any error checking I do. I would like to catch the exceptions up
>>>>>> in my code where the parser is called, so that I can branch to a
>>>>>> continue statement and have my batch processing loops move on to the
>>>>>> next file.
>>>>>> Should I edit out the try-catch blocks and compile my own version of
>>>>>> the library? Or should I test the returned StructureImpl objects for
>>>>>> possession of the fields in question?
>>>>>> In that case, I'm not sure
>>>>>> which properties will give the most general success information...and
>>>>>> I'd rather not have to check for /every/ property being correct.
>>>>>>
>>>>>> If there is some great way to check if an exception was caught down a
>>>>>> series of nested method calls, please hit me over the head with it.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> -da
>>>>>> _______________________________________________
>>>>>> Biojava-l mailing list - [email protected]
>>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>>>>
>>>>>
>>>>>
>>>
>>
>
>
>
> --
> -----------------------------------------------------------------------
> Dr. Andreas Prlic
> Senior Scientist, RCSB PDB Protein Data Bank
> University of California, San Diego
> (+1) 858.246.0526
> -----------------------------------------------------------------------
