For the record, here's that PDB algorithm. It starts with three inputs:
the atom name, the residue code, and whether or not this is a HETATM record
(determined from the record type in a PDB file, or, in a mol2 file, from the
residue code not matching a known ATOM residue type). The algorithm was
developed by looking at all known ligand and standard atom names in the PDB
database.

1) If the atom name is the same as the residue code, return the atom name.
This takes care of CA/CA, MG/MG, etc.
2) From the atom name, remove all leading numerical digits. From what is
left, define two characters, ch1 (a letter) and ch2 (a character or space).
3) Create a string consisting of the residue code, ".", ch1, and ch2.
4) If this string is one of "OEC.CA ICA.CA OC1.CA OC2.CA OC4.CA", return
"Ca".
5) If the name includes a prime (') or an asterisk (*), return ch1.
6) If ch1 is one of H, C, N, or O and ch2 is anything less than or equal to
"H", return ch1.
7) If the atom name starts with "CM", return "C".
8) If this is a HETATM entry and ch1 or ch1+ch2 is a valid element symbol,
return that element symbol.
9) If ch1 is a valid element symbol, return that.
10) If ch2 is a valid element symbol, return that.
11) Give up. Return "Xx".
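
The eleven steps above can be sketched in Java roughly as follows. This is
only an illustration, not the actual Jmol code: the class name
PdbElementSketch, the method guessElement, and the short ELEMENTS table are
made up for the example. It also assumes upper-case PDB atom names, that
step 8 tries the two-letter symbol before the one-letter symbol, and that
step 7 is checked after leading digits are removed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class PdbElementSketch {
    // Assumption: a short illustrative table; a real one would list every element.
    private static final Set<String> ELEMENTS = new HashSet<>(Arrays.asList(
        "H", "C", "N", "O", "F", "P", "S", "K", "I",
        "Na", "Mg", "Cl", "Ca", "Mn", "Fe", "Zn", "Se", "Br"));

    // The residue.atom combinations from step 4.
    private static final Set<String> CALCIUM_KEYS = new HashSet<>(Arrays.asList(
        "OEC.CA", "ICA.CA", "OC1.CA", "OC2.CA", "OC4.CA"));

    // Converts upper-case PDB spelling ("FE") to symbol spelling ("Fe").
    private static String toSymbol(String s) {
        return s.substring(0, 1).toUpperCase() + s.substring(1).toLowerCase();
    }

    static String guessElement(String atomName, String resCode, boolean isHetero) {
        String name = atomName.trim();
        String res = resCode.trim();

        // Step 1: CA/CA, MG/MG, etc.
        if (name.equals(res)) return name;

        // Step 2: drop leading digits, then take ch1 and ch2.
        int i = 0;
        while (i < name.length() && Character.isDigit(name.charAt(i))) i++;
        String rest = name.substring(i);
        if (rest.isEmpty()) return "Xx"; // assumption: nothing left to inspect
        char ch1 = rest.charAt(0);
        char ch2 = rest.length() > 1 ? rest.charAt(1) : ' ';
        String s1 = String.valueOf(ch1);

        // Steps 3 and 4: special-case calcium in a few residues.
        String key = res + "." + ch1 + ch2;
        if (CALCIUM_KEYS.contains(key)) return "Ca";

        // Step 5: primes and asterisks mark sugar atoms such as O5'.
        if (name.indexOf('\'') >= 0 || name.indexOf('*') >= 0) return s1;

        // Step 6: H, C, N, O followed by a space, a digit, or a letter A..H.
        if ("HCNO".indexOf(ch1) >= 0 && ch2 <= 'H') return s1;

        // Step 7: "CM..." is a carbon (this check keeps it from matching curium).
        if (rest.startsWith("CM")) return "C";

        // Step 8: HETATM only. Assumption: two-letter symbol is tried first.
        if (isHetero) {
            String two = toSymbol(("" + ch1 + ch2).trim());
            if (ELEMENTS.contains(two)) return two;
            if (ELEMENTS.contains(s1)) return s1;
        }

        // Steps 9 and 10: fall back to ch1, then ch2.
        if (ELEMENTS.contains(s1)) return s1;
        String s2 = String.valueOf(ch2);
        if (ELEMENTS.contains(s2)) return s2;

        // Step 11: give up.
        return "Xx";
    }
}
```

With this sketch, guessElement("CA", "ALA", false) gives "C" via step 6,
guessElement("CA", "OEC", true) gives "Ca" via step 4, and
guessElement("FE", "HEM", true) gives "Fe" via step 8.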



Bob

On Thu, Oct 30, 2008 at 1:26 PM, Robert Hanson <[EMAIL PROTECTED]> wrote:

> Nico, I just finished with 11.7.6.
>
> For the record, here's how I implemented atom typing in Mol2Reader.java:
>
> 1) Using the information in
> http://www.chem.cmu.edu/courses/09-560/docs/msi/ffbsim/B_AtomTypes.html, I
> created a list of all designated force field atom types [1].
> 2) An atom type in field 5 is checked.
>   If it is a single letter or an Uppercase+lowercase combination, then it
> is considered an element symbol.
>   If it is a single upper case letter or Upper+lower combination followed
> by 0, then the characters prior to 0
>     are considered to be an atom type (ESFF metal type)
>   If it is one of the designated atom types, then the first character is
> used as the atom type unless:
>     a) it is on the "twoChar" list [2], in which case the first two
> characters are used, or
>     b) it is on the "secondCharOnly" list (" AH BH AC BC "), in which case
> the second character is used, or
>     c) it is on the "specialTypes" list (" sz az sy ay ayt "), in which
> case something special is done (zeolite Si and Al)
>
> 3)  If none of this sets the element symbol, and the file is found to have
> a known PDB residue code in field 7,  then the atom type is determined by a
> relatively complex algorithm that uses the presumed PDB atom name in field 5
> and the PDB residue code in field 7.
>
> Bob
>
>
> [1]  private final static String ffTypes =
>       /* AMBER   */ " C* C2 C3 CA CB CC CD CE CF CG CH CI CJ CK CM CN CP CQ
> CR CT CV CW H2 H3 HC HO HS HW LP N* N2 N3 NA NB NC NT O2 OH OS OW SH AH BH
> HT HY AC BC CS OA OB OE OT "
>     + /* CFF     */ " dw hc hi hn ho hp hs hw h* h+ hscp htip ca cg ci cn
> co coh cp cr cs ct c1 c2 c3 c5 c3h c3m c4h c4m c' c\" c* c- c+ c= c=1 c=2 na
> nb nh nho nh+ ni nn np npc nr nt nz n1 n2 n3 n4 n3m n3n n4m n4n n+ n= n=1
> n=2 oc oe oh op o3e o4e o' o* o- oscp otip sc sh sp s1 s3e s4e s' s- br cl f
> i ca+ ar si lp nu sz oz az pz ga ge tioc titd li+ na+ k+ rb+ cs+ mg2+ ca2+
> ba2+ cu2+ f- cl- br- i- so4 sy oy ay ayt nac+ mg2c fe2c mn4c mn3c co2c ni2c
> lic+ pd2+ ti4c sr2c ca2c cly- hocl py vy nh4+ so4y lioh naoh koh foh cloh
> beoh al "
>     + /* CHARMM  */ " C3 C4 C5R C5RE C5RP C5RQ C6R C6RE C6RP C6RQ CE1 CF1
> CF2 CF3 CG CD2 CH1E CH2E CH3E CM CP3 CPH1 CPH2 CQ66 CR55 CR56 CR66 CS66 CT
> CT3 CT4 CUA1 CUA2 CUA3 CUY1 CUY2 HA HC HMU HO HT LP N3 N5R N5RP N6R N6RP NC
> NC2 NO2 NP NR1 NR2 NR3 NR55 NR56 NR66 NT NX O2M O5R O6R OA OAC OC OE OH2 OK
> OM OS OSH OSI OT OW P6R PO3 PO4 PT PUA1 PUY1 S5R S6R SE SH1E SK SO1 SO2 SO3
> SO4 ST ST2 ST2 "
>     + /* COMPASS */ " h1 h1+ h1h h1n h1o c c1o c2= c2t c3 c3\" c3# c3' c3-
> c3= c3a c3n c3o c4 c43 c44 c4o c4x n1n n1o n1t n2= n2a n2t n3 n3* n3+ n3a
> n3h1 n3h2 n3m n3mh n3o n4+ n4o o-2 o1- o12 o1= o1=* o1c o1n o1o o2 o2* o2a
> o2b o2c o2e o2h o2s o2z o3 o3z s1= s2 s2= s2a s3= s4 s4= s6 p4= br br- br1
> cl cl- cl1 cl12 cl13 cl14 cl1p f f- f1 f12 f13 f14 f1p i i- i1 ca+ cu+2 fe+2
> mg+2 zn+2 cs+ k+ li+ na+ rb+ al4z si si4 si4c si4z ar he kr ne xe "
>     + /* ESFF    */ " dw hi hw h* h+ ca cg ci co coh cp cr cs ct ct3 c1 c2
> c3 c5 c5p c' c- c+ c= na nb nh nho ni no np nt nt2 nz n1 n2 n4 n+ n= oa oc
> oh op os ot o1 o' o* o- sp s1 s2d s3d s4d s4l s5l s5t s6 s6o s' s- p4d p4l
> p5l p5t p53 p6 p6o p' bt b' Be+ Be+2 Li+ cl' Mg+ Mg+2 Na+ si4l si5l si5t si6
> si6o si' "
>     + /* GAFF    */ " br c1 c2 c3 ca cc cd ce cf cl cp cq cu cv cx cy f h1
> h2 h3 h4 h5 ha hc hn ho hp hs n1 n2 n4 na nb nc nd nh oh os p2 p3 p4 p5 pb
> pc pd pe pf px py s2 s4 s6 sh ss sx sy "
>     + /* PCFF    */ " hn2 ho2 c_0 c_1 c_2 cz o= o_1 o_2 oo oz p= si sio hsi
> osi ";
>
>
> [2]
>   private final static String twoChar = " LP ca+ br cl ar si lp nu br- br1
> cl- cl1 cl12 cl13 cl14 cl1p cu+2 fe+2 mg+2 zn+2 cs+ li+ na+ rb+ al4z si si4
> si4c si4z he kr ne xe ga ge tioc titd li+ na+ rb+ cs+ mg2+ ca2+ ba2+ cu2+
> nac+ mg2c fe2c mn4c mn3c co2c ni2c lic+ pd2+ ti4c sr2c ca2c cly- lioh naoh
> cloh beoh al Be+ Be+2 Li+ cl' Mg+ Mg+2 Na+ si4l si5l si5t si6 si6o si' sio
> ";
>
>
>
>
> --
> Robert M. Hanson
> Professor of Chemistry
> St. Olaf College
> 1520 St. Olaf Ave.
> Northfield, MN 55057
> http://www.stolaf.edu/people/hansonr
> phone: 507-786-3107
>
>
> If nature does not answer first what we want,
> it is better to take what answer we get.
>
> -- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900
>
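
The Mol2 atom-type rules in step 2 of the quoted message can be sketched in
Java along these lines. This is a hedged illustration, not the code in
Mol2Reader.java: the class Mol2TypeSketch and the method elementFromType are
hypothetical, the FF_TYPES and TWO_CHAR strings are abbreviated stand-ins
for the full lists quoted above, and the mapping of sz/sy to Si and
az/ay/ayt to Al is an assumption about what "something special" means for
the zeolite types.

```java
final class Mol2TypeSketch {
    // Abbreviated stand-ins for the full lists quoted in the message.
    private static final String FF_TYPES =
        " CT HC OW AH BH AC BC hc c3 ca+ br cl sz az sy ay ayt ";
    private static final String TWO_CHAR = " LP ca+ br cl ar si ";
    private static final String SECOND_CHAR_ONLY = " AH BH AC BC ";
    private static final String SPECIAL_TYPES = " sz az sy ay ayt ";

    // Space-delimited lookup; matching is case-sensitive.
    private static boolean onList(String list, String type) {
        return list.contains(" " + type + " ");
    }

    private static String capitalize(String s) {
        return s.substring(0, 1).toUpperCase() + s.substring(1).toLowerCase();
    }

    /** Returns an element symbol, or null if the PDB-name fallback is needed. */
    static String elementFromType(String type) {
        // A single letter or an Upper+lower pair is taken as an element symbol.
        // Assumption: a single lower-case letter is upper-cased.
        if (type.matches("[A-Za-z]|[A-Z][a-z]")) return capitalize(type);

        // ESFF metal type: symbol followed by 0, e.g. "Zn0".
        if (type.matches("[A-Z][a-z]?0")) return type.substring(0, type.length() - 1);

        // Everything else must be a designated force field type.
        if (!onList(FF_TYPES, type)) return null;

        // Assumption: zeolite types starting with "s" are Si, the rest are Al.
        if (onList(SPECIAL_TYPES, type)) return type.startsWith("s") ? "Si" : "Al";
        if (onList(TWO_CHAR, type)) return capitalize(type.substring(0, 2));
        if (onList(SECOND_CHAR_ONLY, type)) return type.substring(1, 2);

        // Default: the first character is the element.
        return type.substring(0, 1).toUpperCase();
    }
}
```

A null result corresponds to step 3 of the quoted message, where the reader
falls back on the PDB atom name and residue code.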


