Yes, I would like to have direct access to the element symbol data that's in the file. Otherwise, anyone that needs the element type has to create rules for interpreting it from the "atom name" field. It feels wrong to attempt to deduce data when it is provided explicitly.
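To make the request concrete, here is a rough sketch of the kind of access I have in mind: read the element symbol from columns 77-78 of an ATOM/HETATM record, and fall back to columns 13-14 of the atom name field only when those columns are blank. The class and method names (ElementSymbolSketch, elementSymbol) are hypothetical and not part of BioJava, and the fallback is a deliberately crude heuristic, not a faithful copy of any existing tool's rules.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not BioJava API.
final class ElementSymbolSketch {

    /**
     * Returns the element symbol for one ATOM/HETATM record line,
     * or an empty string if nothing usable is found.
     */
    static String elementSymbol(String line) {
        // Columns 77-78 (1-based) hold the right-justified element symbol.
        if (line.length() >= 78) {
            String explicit = line.substring(76, 78).trim();
            if (!explicit.isEmpty()) {
                return explicit.toUpperCase(Locale.ROOT);
            }
        }
        // Assumed fallback: the first two columns of the atom name field
        // (columns 13-14), with digits removed. This is a simplification
        // and will misread some names (e.g. hydrogens named "HG11").
        if (line.length() >= 14) {
            String fromName = line.substring(12, 14).replaceAll("[0-9]", "").trim();
            return fromName.toUpperCase(Locale.ROOT);
        }
        return "";
    }
}
```

With this, a C-alpha record (atom name " CA ") yields "C" and a calcium record (atom name "CA  ") yields "CA", whether or not columns 77-78 are filled in.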
These PDB remediation project notes suggest using the element symbol specified in columns 77-78:
http://nar.oxfordjournals.org/cgi/content/full/36/suppl_1/D426#SEC3
"Atom types are provided for every atom (i.e. ATOM record columns 77-78), so prior atom name justification conventions should no longer be assumed in reading atom names."

JMOL uses the PDB element symbol if present, else interprets it from the "atom name" field:
http://wiki.jmol.org/index.php/AtomSets
"On PDB format, Jmol will identify the element from columns 77-78 (element symbol, right-justified). If this is absent, then it will interpret the "atom name" field (columns 13-14) to deduce the element identity."

JMOL is LGPL. If interpretation is desirable, BioJava could start with JMOL's current approach. Personally, I would be happy just with access to the data in the file.

________________________________
From: [email protected] [mailto:[email protected]] On Behalf Of Andreas Prlic
Sent: Monday, April 26, 2010 8:08 PM
To: Andy Thomas-Cramer
Cc: [email protected]
Subject: Re: [Biojava-l] PDBFileParser and Atom element symbol

Hi Andy

Questions:

* Is this pattern documented in the PDB specification?

see here: http://www.wwpdb.org/documentation/format23/sect9.html#ATOM

* If this pattern can be relied on, why are columns 77-78 also dedicated to the element symbol?

That is the atom's element symbol (as given in the periodic table), in contrast to the first name, which contains numbering information.

* Should reliance on the pattern be hidden behind a BioJava method?

If you think that is important we could probably provide an enum for all atom types. There are two categories though: the periodic table symbol and the one that is related to the position in an amino acid....
Andreas

________________________________
From: [email protected] [mailto:[email protected]] On Behalf Of Andreas Prlic
Sent: Friday, April 23, 2010 6:52 PM
To: Andy Thomas-Cramer
Cc: [email protected]
Subject: Re: [Biojava-l] PDBFileParser and Atom element symbol

Hi Andy,

you could check with Atom.getFullname(), which contains the space characters from the PDB file: e.g. Calpha: " CA ", Calcium "CA  "

in addition the parent group of a Calpha atom is usually an AminoAcid and for Calciums it is a Hetatom group...

Andreas

On Fri, Apr 23, 2010 at 3:58 PM, Andy Thomas-Cramer <[email protected]> wrote:

Is there an easy way to identify the type of atom referenced by an Atom object? For example, if Atom.getName() is "CA", is the element calcium or the atom carbon alpha?

If not, would it be feasible to add a method providing this in Atom, AtomImpl, and parsing it in PDBFileParser, using the columns defined at http://www.wwpdb.org/documentation/format32/sect9.html#ATOM ?

_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l

--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------
