Hi Daniel,

The chemical components provide the chemically correct definitions of the various groups. Quite a few chemically modified amino acids in PDB files can be represented as amino acids, rather than as Hetatom groups, based on these definitions. This affects the sequence alignment that is done during the alignSeqRes process: without the correct representations, those groups would be flagged as "X" or might be missing. I will set up a wiki page that explains all the parsing options in detail.

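To make the "X" effect concrete, here is a minimal, self-contained sketch. It is not BioJava's actual code: the class name, the method names, and the two small lookup tables are made up for illustration, and the `PARENT` map only stands in for what the chemical component definitions would supply (for example, that MSE is a modified MET and SEP a modified SER).

```java
import java.util.HashMap;
import java.util.Map;

public class ModifiedResidueDemo {

    // Standard three-letter to one-letter codes (small subset for brevity).
    static final Map<String, Character> STANDARD = new HashMap<>();

    // Hypothetical stand-in for chemical component data:
    // modified residue -> parent standard residue.
    static final Map<String, String> PARENT = new HashMap<>();

    static {
        STANDARD.put("GLY", 'G');
        STANDARD.put("MET", 'M');
        STANDARD.put("SER", 'S');
        STANDARD.put("THR", 'T');
        STANDARD.put("TYR", 'Y');

        PARENT.put("MSE", "MET"); // selenomethionine
        PARENT.put("SEP", "SER"); // phosphoserine
        PARENT.put("TPO", "THR"); // phosphothreonine
        PARENT.put("PTR", "TYR"); // phosphotyrosine
    }

    // Returns the one-letter code for a residue name. Unknown residues become
    // 'X' unless the (assumed) chemical component data names a standard parent.
    static char oneLetter(String code, boolean useChemComp) {
        Character direct = STANDARD.get(code);
        if (direct != null) {
            return direct;
        }
        if (useChemComp) {
            String parent = PARENT.get(code);
            if (parent != null && STANDARD.containsKey(parent)) {
                return STANDARD.get(parent);
            }
        }
        return 'X';
    }

    // Builds the one-letter sequence that would go into the alignment.
    static String toSequence(String[] residues, boolean useChemComp) {
        StringBuilder sb = new StringBuilder();
        for (String residue : residues) {
            sb.append(oneLetter(residue, useChemComp));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] atomGroups = {"GLY", "MSE", "SEP", "TYR"};
        System.out.println(toSequence(atomGroups, true));  // GMSY
        System.out.println(toSequence(atomGroups, false)); // GXXY
    }
}
```

With the parent lookup the ATOM-side sequence lines up with a SEQRES sequence of G-M-S-Y; without it, the two modified positions degrade to "X", which is the loss of precision described above.
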
About your 2nd comment: there is now a new ChemCompProvider interface. Perhaps there should be a new implementation of it that downloads the file containing all chemcomps bundled together and provides the data from there...

Andreas

On Sat, Dec 18, 2010 at 7:36 PM, Daniel Asarnow <[email protected]> wrote:
> Prior to this, did setting loadChemComp true add processing overhead if
> setAlignSeqRes is also true? I.e. what's the difference between AlignSeqRes
> with and without loadChemComp?
> Just want to know what the right flags are when one wants accurate SEQRES
> <---> ATOM alignments but isn't otherwise using the components info...
> On a related note, I ended up writing a class that loaded and discarded
> Structure objects for my PDBs, to trigger all the downloads before my big
> processing jobs. Though I guess the right (non-lazy) thing to do is parse
> out the combined components library to the individual files.
> -da
>
> On Fri, Dec 17, 2010 at 07:24, Andreas Prlic <[email protected]> wrote:
>>
>> ok that behavior is fixed in SVN now. Now you can have setAlignSeqRes
>> set to true and it will not download chemical components if
>> loadChemComp is false. The drawback is that the data representation
>> will not be as precise.
>>
>> Andreas
>>
>> On Thu, Dec 16, 2010 at 8:26 AM, Steve Darnell <[email protected]>
>> wrote:
>> > The SeqRes to Atom record alignment forces the use of chemical
>> > components to translate non-standard residues to their closest standard
>> > counterpart for the sequence alignment. I have to disable
>> > setLoadChemCompInfo and setAlignSeqRes when I don't want to download
>> > chemical component files from RCSB when parsing a PDB file.
>> >
>> > Regards,
>> > Steve
>> >
>> > -----Original Message-----
>> > From: [email protected]
>> > [mailto:[email protected]] On Behalf Of Fico
>> > Sent: Wednesday, December 15, 2010 8:46 PM
>> > To: [email protected]
>> > Subject: [Biojava-l] how to cancel download chemcomp when parser a PDB
>> > file
>> >
>> > Hi, dear all:
>> >
>> > I have recently been using biojava3 beta1 to parse PDB files. My program is:
>> >
>> > PDBFileReader pdbreader = new PDBFileReader();
>> > pdbreader.setAutoFetch(false);
>> > pdbreader.setPath(pdbDirPath);
>> >
>> > FileParsingParameters params = new FileParsingParameters();
>> > params.setLoadChemCompInfo(false);
>> > params.setHeaderOnly(false);
>> > params.setAlignSeqRes(true);
>> > params.setParseSecStruc(false);
>> > pdbreader.setFileParsingParameters(params);
>> >
>> > Structure structure = null;
>> > try {
>> >     structure = pdbreader.getStructure(pdbDirPath + "\\" + file);
>> > } catch (IOException e) {
>> >     e.printStackTrace();
>> > }
>> >
>> > When I execute this program, it downloads files such as:
>> >
>> > creating directory D:\MyWorkspace\TestFiles\pdbFiles\chemcomp
>> > downloading http://www.rcsb.org/pdb/files/ligand/35G.cif
>> > downloading http://www.rcsb.org/pdb/files/ligand/GDP.cif
>> >
>> > but I do not want to download those files. How can I cancel it?
>> > Thanks.
>> > _______________________________________________
>> > Biojava-l mailing list - [email protected]
>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
