Ahh, I suppose that is the "problem" referred to in the wiki? I checked out successfully from the repository on GitHub.
-da

On Thu, Oct 28, 2010 at 16:45, Daniel Asarnow <[email protected]> wrote:
> It's not a big deal - after all if you use CA only, chains with no
> CA's aren't important, and the error messages aren't that long. But
> I'm going to switch anyway...
> I'm getting the dreaded "can't read line length in file" error while
> trying to check out biojava-live/trunk, though.
>
> -da
>
> On Thu, Oct 28, 2010 at 10:28, Andreas Prlic <[email protected]> wrote:
>> Hi Daniel,
>>
>> I just checked, this is a bug which is already resolved in 3.0... If
>> it is an issue for you, you might want to upgrade... (should be very
>> easy, if you start using Maven ...)
>>
>> Thanks,
>> Andreas
>>
>> On Wed, Oct 27, 2010 at 9:04 PM, Daniel Asarnow <[email protected]> wrote:
>>> I'm using 1.7, partially because my distro had a package for it and
>>> partially because I was initially using the online Javadoc a lot.
>>> PDB ID 1a02 with CA only parses but gives 4 StructureExceptions; I've
>>> pasted them below. Chain A exists in the PDB but is DNA; polypeptide
>>> chain F appears to parse correctly.
>>>
>>> -da
>>>
>>> org.biojava.bio.structure.StructureException: could not find chain A
>>>     at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:217)
>>>     at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:223)
>>>     at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2303)
>>>     at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
>>>     at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
>>>     at fragalign.pair.getStructs(pair.java:42)
>>>     at fragalign.Main.main(Main.java:40)
>>> org.biojava.bio.structure.StructureException: could not find chain B
>>>     at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:217)
>>>     at org.biojava.bio.structure.StructureImpl.findChain(StructureImpl.java:223)
>>>     at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2303)
>>>     at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
>>>     at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
>>>     at fragalign.pair.getStructs(pair.java:42)
>>>     at fragalign.Main.main(Main.java:40)
>>> org.biojava.bio.structure.StructureException: did not find chain with chainId >A<
>>>     at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:541)
>>>     at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:548)
>>>     at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2340)
>>>     at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
>>>     at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
>>>     at fragalign.pair.getStructs(pair.java:42)
>>>     at fragalign.Main.main(Main.java:40)
>>> org.biojava.bio.structure.StructureException: did not find chain with chainId >B<
>>>     at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:541)
>>>     at org.biojava.bio.structure.StructureImpl.getChainByPDB(StructureImpl.java:548)
>>>     at org.biojava.bio.structure.io.PDBFileParser.linkChains2Compound(PDBFileParser.java:2340)
>>>     at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2210)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2107)
>>>     at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:1963)
>>>     at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:452)
>>>     at fragalign.pair.getStructs(pair.java:42)
>>>     at fragalign.Main.main(Main.java:40)
>>>
>>>
>>> On Wed, Oct 27, 2010 at 17:47, Andreas Prlic <[email protected]> wrote:
>>>>> I assume AtomCache is a new class in BioJava3?
>>>>
>>>> yes it is... http://biojava.org/wiki/BioJava:CookBook:PDB:read3.0
>>>>
>>>>>
>>>>> I must give you my embarrassed apology...after a bunch of testing I
>>>>> finally figured out that I had misunderstood where the Parser's error
>>>>> handling returns control and started going after the wrong exceptions.
>>>>> It does look like if setParseCAOnly is true, the reader excepts on
>>>>> chains with no CA's instead of just skipping them, though the other
>>>>> chains are still parsed into the structure.
>>>>
>>>> This sounds like there might be a problem with CA only.. do you have
>>>> an example ID? also: are you on biojava 1.7 or 3.0 ?
>>>>
>>>> Andreas
>>>>
>>>>>
>>>>> -da
>>>>>
>>>>> On Tue, Oct 26, 2010 at 22:19, Andreas Prlic <[email protected]> wrote:
>>>>>> Hi Daniel,
>>>>>>
>>>>>> PDB files are better nowadays, due to remediation, however there are
>>>>>> still issues..
>>>>>>
>>>>>> it sounds like you just want to figure out how to do the try/catch
>>>>>> block properly. You could do something like that:
>>>>>>
>>>>>> boolean splitFileOrganisation = true;
>>>>>> AtomCache cache = new AtomCache("/path/to/your/installation/", splitFileOrganisation);
>>>>>>
>>>>>> String[] pdbIDs = new String[]{"4hhb", "1cdg", "5pti", "1gav", "WRONGID"};
>>>>>>
>>>>>> for (String pdbID : pdbIDs) {
>>>>>>     try {
>>>>>>         Structure s = cache.getStructure(pdbID);
>>>>>>         if (s == null) {
>>>>>>             System.out.println("could not find structure " + pdbID);
>>>>>>             continue;
>>>>>>         }
>>>>>>         // do something with the structure - your inner loop
>>>>>>         System.out.println(s);
>>>>>>     } catch (Exception e) {
>>>>>>         // something crazy happened...
>>>>>>         System.err.println("Can't load structure " + pdbID + " reason: " + e.getMessage());
>>>>>>         e.printStackTrace();
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 26, 2010 at 9:59 PM, Daniel Asarnow <[email protected]> wrote:
>>>>>>> Glad to hear it, who doesn't like support or clean interfaces? No
>>>>>>> offense intended, by the way, with respect to PDB errors - obviously
>>>>>>> the PDB is an indispensable resource for all protein scientists.
>>>>>>>
>>>>>>> I am looking at many (fixed-length) pieces of protein chains and doin'
>>>>>>> stuff with 'em. My current code has a pair of nested while loops; the
>>>>>>> outer iterates over PDB entries (locally rsync'd copy), parsing them
>>>>>>> and the inner iterates over the pieces from each. When
>>>>>>> StructureExceptions come out of my PDBFileReader object I want to
>>>>>>> continue the outer loop, moving on to the next set of files without
>>>>>>> executing any of the code that depends on correct StructureImpl
>>>>>>> objects from the reader (database updates, the inner loop).
>>>>>>> Since the reader's methods have their own try-catch blocks, a thrown
>>>>>>> StructureException is stopped there and never reaches my own error
>>>>>>> handling. I just need to know when those errors occur so I can skip
>>>>>>> those proteins - I am presuming that the correct entries will outweigh
>>>>>>> the problem ones by a significant factor and the overall data won't be
>>>>>>> seriously impacted.
>>>>>>>
>>>>>>> -da
>>>>>>>
>>>>>>> On Tue, Oct 26, 2010 at 21:11, Andreas Prlic <[email protected]> wrote:
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>> can you explain a bit more what you are doing, in particular what
>>>>>>>> errors you would like to deal with on your end? You should not need
>>>>>>>> to worry too much about exception handling. Are there any special
>>>>>>>> cases you are interested in? In this case we should support you with
>>>>>>>> a clean interface rather than exception handling from your end...
>>>>>>>>
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 26, 2010 at 8:54 PM, Daniel Asarnow <[email protected]> wrote:
>>>>>>>>> Hi all,
>>>>>>>>> Let me first say thanks to all the BioJava community members for
>>>>>>>>> delivering such a useful set of libraries, and that I'm still a newbie
>>>>>>>>> when it comes to BioJava (and Java) so forgive me if my question is
>>>>>>>>> too trivial.
>>>>>>>>>
>>>>>>>>> I am doing work on lots (at least thousands) of PDB files from RCSB.
>>>>>>>>> As is commonly known, these are often rife with errors which can lead
>>>>>>>>> to exceptions during parsing with PDBFileParser. Because
>>>>>>>>> PDBFileParser's methods contain their own try-catch blocks, exception
>>>>>>>>> propagation stops there and my code proceeds blindly along regardless
>>>>>>>>> of any error checking I do. I would like to catch the exceptions up
>>>>>>>>> in my code where the parser is called, so that I can branch to a
>>>>>>>>> continue statement and have my batch processing loops move on to the
>>>>>>>>> next file.
>>>>>>>>> Should I edit out the try-catch blocks and compile my own version of
>>>>>>>>> the library? Or should I test the returned StructureImpl objects for
>>>>>>>>> possession of the fields in question? In that case, I'm not sure
>>>>>>>>> which properties will give the most general success information...and
>>>>>>>>> I'd rather not have to check for /every/ property being correct.
>>>>>>>>>
>>>>>>>>> If there is some great way to check if an exception was caught down a
>>>>>>>>> series of nested method calls, please hit me over the head with it.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> -da
>>>>>>>>> _______________________________________________
>>>>>>>>> Biojava-l mailing list - [email protected]
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>>>>>>>
>>>>
>>>> --
>>>> -----------------------------------------------------------------------
>>>> Dr. Andreas Prlic
>>>> Senior Scientist, RCSB PDB Protein Data Bank
>>>> University of California, San Diego
>>>> (+1) 858.246.0526
>>>> -----------------------------------------------------------------------
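
For anyone reading this thread later: the skip-and-continue loop discussed above can also keep a list of the IDs that failed, so they can be reviewed after the batch run. The following is only a minimal sketch. BatchLoadSketch, Loader and loadAll are hypothetical names made up for illustration, not part of BioJava; the Loader stands in for a call such as cache.getStructure(pdbID) from the quoted example. It only helps when the failure actually reaches the caller as an exception or a null result, which, as noted in the thread, is not always the case with the 1.7 parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchLoadSketch {

    /** Hypothetical stand-in for a call such as cache.getStructure(pdbID). */
    interface Loader<T> {
        T load(String id) throws Exception;
    }

    /**
     * Loads every ID in order. IDs whose loader throws or returns null are
     * skipped and appended to 'failed'; everything else is returned.
     */
    static <T> List<T> loadAll(List<String> ids, Loader<T> loader, List<String> failed) {
        List<T> loaded = new ArrayList<>();
        for (String id : ids) {
            try {
                T item = loader.load(id);
                if (item == null) {
                    failed.add(id);   // nothing came back: skip this entry
                    continue;
                }
                loaded.add(item);     // the inner loop would run on 'item' here
            } catch (Exception e) {
                failed.add(id);       // loader threw: skip and move on
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<String> failed = new ArrayList<>();
        // Toy loader: upper-cases the ID and rejects one known-bad ID.
        List<String> ok = loadAll(Arrays.asList("4hhb", "1cdg", "WRONGID"), id -> {
            if (id.equals("WRONGID")) {
                throw new IllegalArgumentException("unknown ID " + id);
            }
            return id.toUpperCase();
        }, failed);
        System.out.println("loaded=" + ok + " failed=" + failed);
    }
}
```

Collecting the failed IDs rather than only printing a stack trace makes it easy to check afterwards whether the skipped entries really are a small fraction of the data set.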
