Miguel,

The algorithm, as I figure it, is as follows. This is a hypothesis based on the contents of pdb-extract's iupac_atom_name_global.h.

From PDB to mmCIF:

For the amino acids (ALA through VAL) as part of chains:

a) C,N,O,S all same as PDB--no change
b) Hydrogens renamed as follows:

   HT becomes H

   nHXm becomes HXmp

   where

        p=n when this is a CH3 carbon:

                ALA-HB
                ILE-HG2,HD1
                LEU-HD1,HD2
                LYS-HZ
                MET-HE
                THR-HG2
                VAL-HG1,HG2


or p=(n+1) when this is a CH2 carbon (all others).

The rationale seems to be that you are counting substituents,
not H atoms, so for -CH2R, R is #1, the first H is #2, the second H is #3
instead of the old 1H,2H. For example, LEU 1HB and 2HB become HB2 and HB3.
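
Here is a minimal sketch of the amino-acid hydrogen rule as stated above, in Python. The function name, the table name, and the regular expression are my own choices for illustration and are not taken from pdb-extract. It covers only the HT rule and the nHXm rule, so hydrogens that do not fit the nHXm pattern pass through unchanged.

```python
import re

# Hydrogens on CH3-type groups, where p = n (the list given above).
# For ILE the methyl carbons are CG2 and CD1; CG1 is a CH2.
METHYL_HYDROGENS = {
    "ALA": {"HB"},
    "ILE": {"HG2", "HD1"},
    "LEU": {"HD1", "HD2"},
    "LYS": {"HZ"},
    "MET": {"HE"},
    "THR": {"HG2"},
    "VAL": {"HG1", "HG2"},
}

# Assumed shape of a PDB-style name nHXm: a leading digit n, then "H",
# one letter X, and an optional branch digit m.
_PDB_H = re.compile(r"^([1-3])(H[A-Z][0-9]?)$")


def pdb_to_cif_amino_h(res_name, atom_name):
    """Rename one amino-acid atom from PDB style to mmCIF style."""
    if atom_name == "HT":
        return "H"
    match = _PDB_H.match(atom_name)
    if match is None:
        return atom_name  # C, N, O, S and anything else: no change
    n = int(match.group(1))
    stem = match.group(2)
    if stem in METHYL_HYDROGENS.get(res_name, set()):
        p = n        # CH3: keep the number
    else:
        p = n + 1    # CH2: count the substituent R as #1
    return f"{stem}{p}"


print(pdb_to_cif_amino_h("LEU", "1HB"))   # CH2 case
print(pdb_to_cif_amino_h("LEU", "1HD1"))  # CH3 case
```

The residue name has to be passed in because the same stem (HG1, say) is a methyl in VAL but not in every residue.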

For the nucleotides A,C,T,G,I:

        OnP becomes OPn
        2HO* becomes HO2'
        1X* becomes X'
        2X* becomes X''
        "5M" becomes "7"
        nXm becomes Xmn
        * becomes '

in that order.
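
The ordered list above can be sketched as a cascade of substitutions, each applied to the result of the previous one. This is only my reading of the rules: the regular expressions (in particular, treating X as one or more characters and n, m as digits) and the names NUCLEOTIDE_RULES and pdb_to_cif_nucleotide are assumptions for illustration, not code from pdb-extract.

```python
import re

# One (pattern, replacement) pair per rule, in the order listed above.
NUCLEOTIDE_RULES = [
    (re.compile(r"^O(\d)P$"), r"OP\1"),               # OnP  -> OPn
    (re.compile(r"^2HO\*$"), "HO2'"),                 # 2HO* -> HO2'
    (re.compile(r"^1(.+)\*$"), r"\1'"),               # 1X*  -> X'
    (re.compile(r"^2(.+)\*$"), r"\1''"),              # 2X*  -> X''
    (re.compile(r"5M"), "7"),                         # "5M" -> "7"
    (re.compile(r"^(\d)([A-Z]+)(\d+)$"), r"\2\3\1"),  # nXm  -> Xmn
    (re.compile(r"\*"), "'"),                         # *    -> '
]


def pdb_to_cif_nucleotide(atom_name):
    """Apply every rule in order; later rules see earlier results."""
    for pattern, replacement in NUCLEOTIDE_RULES:
        atom_name = pattern.sub(replacement, atom_name)
    return atom_name


for name in ["O1P", "2HO*", "1H5*", "2H5*", "C5M", "1H5M", "C4*"]:
    print(name, "->", pdb_to_cif_nucleotide(name))
```

The order matters: 2HO* has to be caught before the general 2X* rule, and "5M" has to become "7" before the nXm rule moves the leading digit to the end.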

Then, in filterlib-v8\include\_atom_change_global.h, we have seven fundamental models which are mapped to more interesting groups such as DAR, DAS, DCY.

Looks to me like this is a reverse-translation table, going BACK to PDB from three or more variants of mmCIF, where HXmp, pHXm, or mpHX (CIF?) all map back to nHXm. In this set, the CIF single quote is recorded as *.

Also shown in this file are back-mappings from CIF to PDB for the monomeric amino acids, which would indicate that there have been some additional differences in the naming of those in CIF format. For example, in alanine, which in its monomer state might have an H on the O or three Hs on N, it looks like "H" (specifically, just that, on the O?, PDB) has become "HN", and "1H/2H/3H" has become one of "HT1/HT2/HT3" or "HN1/HN2/HN3" or "1HT/2HT/3HT" or "H1/H2/H3" (three of these--must be on N). Yow! Likewise, arginine "HH" has become, alternatively, "HH" or "HN", with numbers every which-way. Lots more there.

(I think there could be an error in this file regarding H5 becoming H5M, since in the other file H5M becomes H7.)

I see from mmCIF output from the RCSB that BOTH the IUPAC and PDB names ("auth") are there. In actuality, it would appear that different authors have different conventions in PDB names (and maybe even CIF names), but from the above it also appears that even CIF name formats have evolved or are somewhat nonstandard.

_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_alt_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_entity_id
_atom_site.label_seq_id
_atom_site.pdbx_PDB_ins_code
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
...
_atom_site.auth_seq_id
_atom_site.auth_comp_id
_atom_site.auth_asym_id
_atom_site.auth_atom_id
_atom_site.pdbx_PDB_model_num

ATOM 1     N N     . LEU A 1 1  ... 3  LEU A N    1
ATOM 2     C CA    . LEU A 1 1  ... 3  LEU A CA   1
ATOM 3     C C     . LEU A 1 1  ... 3  LEU A C    1
...
ATOM 13    H HB2   . LEU A 1 1  ... 3  LEU A 1HB  1
ATOM 14    H HB3   . LEU A 1 1  ... 3  LEU A 2HB  1
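
Since both names sit in the same row, pulling out the PDB-to-IUPAC pairs can be sketched in a few lines of Python. This is a rough illustration only: the function name is made up, the column list is a reduced subset of the one above (the insertion code, the coordinates, and the columns elided with "..." are dropped so that the sample rows stay complete), and the naive whitespace split would not cope with quoted values such as nucleotide names containing a single quote.

```python
def auth_to_label_names(column_names, rows):
    """Map (auth_comp_id, auth_atom_id) to label_atom_id for each row."""
    # Strip the "_atom_site." prefix so fields can be looked up by short name.
    short = [name.split(".", 1)[1] for name in column_names]
    mapping = {}
    for row in rows:
        fields = dict(zip(short, row.split()))
        key = (fields["auth_comp_id"], fields["auth_atom_id"])
        mapping[key] = fields["label_atom_id"]
    return mapping


# Reduced column subset (assumption: elided and coordinate columns omitted).
COLUMNS = [
    "_atom_site.group_PDB", "_atom_site.id", "_atom_site.type_symbol",
    "_atom_site.label_atom_id", "_atom_site.label_alt_id",
    "_atom_site.label_comp_id", "_atom_site.label_asym_id",
    "_atom_site.label_entity_id", "_atom_site.label_seq_id",
    "_atom_site.auth_seq_id", "_atom_site.auth_comp_id",
    "_atom_site.auth_asym_id", "_atom_site.auth_atom_id",
    "_atom_site.pdbx_PDB_model_num",
]

ROWS = [
    "ATOM 1  N N   . LEU A 1 1 3 LEU A N   1",
    "ATOM 13 H HB2 . LEU A 1 1 3 LEU A 1HB 1",
    "ATOM 14 H HB3 . LEU A 1 1 3 LEU A 2HB 1",
]

names = auth_to_label_names(COLUMNS, ROWS)
print(names[("LEU", "1HB")])
```

A real reader would take the column order from the file's own loop header rather than a fixed list.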



So that translation is a snap, I think. What other files are you interested in reading or writing?

Bob

--

Robert M. Hanson, [EMAIL PROTECTED], 507-646-3107
Professor of Chemistry, St. Olaf College
1520 St. Olaf Ave., Northfield, MN 55057
mailto:[EMAIL PROTECTED]
http://www.stolaf.edu/people/hansonr

"Imagination is more important than knowledge."  - Albert Einstein



_______________________________________________
Jmol-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jmol-developers
