Miguel,
The algorithm, as I figure it, is as follows. This is a hypothesis based on the contents of pdb-extract's iupac_atom_name_global.h.
From PDB to mmCIF:
ALA-VAL, as part of chains:
a) C, N, O, S: all the same as in PDB; no change.
b) Hydrogens are renamed as follows:
HT becomes H
nHXm becomes HXmp
where
p=n when this is a CH3 carbon:
ALA-HB
ILE-HG2,HD1
LEU-HD1,HD2
LYS-HZ
MET-HE
THR-HG2
VAL-HG1,HG2
or p=(n+1) when this is a CH2 carbon (all others).
The rationale seems to be that you are counting substituents, not H atoms, so for -CH2R, R is #1, the first H is #2, the second H is #3 instead of the old 1H,2H.
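Here is a minimal Python sketch of the hydrogen rule above, just to make the hypothesis concrete. The function name and the regular expression are my own; the methyl table follows the list above, with isoleucine's methyl carbons taken as CG2 and CD1 (HG2, HD1). It only covers the nHXm pattern and HT, and passes everything else through unchanged.

```python
import re

# Groups where p = n (methyl-type), keyed by residue name.
# Isoleucine: CG2 and CD1 are the methyl carbons.
METHYL_GROUPS = {
    "ALA": {"HB"},
    "ILE": {"HG2", "HD1"},
    "LEU": {"HD1", "HD2"},
    "LYS": {"HZ"},
    "MET": {"HE"},
    "THR": {"HG2"},
    "VAL": {"HG1", "HG2"},
}

# nHXm: a leading digit n, then H, a position letter X, and an optional digit m.
_NHXM = re.compile(r"^([1-3])(H[A-Z]\d?)$")


def pdb_to_cif_hydrogen(residue, name):
    """Rename one PDB-style atom name using the hypothesized rule (sketch only)."""
    name = name.strip()
    if name == "HT":
        return "H"
    match = _NHXM.match(name)
    if match is None:
        return name  # C, N, O, S and anything else: no change
    n, stem = int(match.group(1)), match.group(2)
    # p = n for the listed methyl-type groups, p = n + 1 for all others
    p = n if stem in METHYL_GROUPS.get(residue, set()) else n + 1
    return f"{stem}{p}"
```

For example, pdb_to_cif_hydrogen("LEU", "1HB") gives "HB2", while pdb_to_cif_hydrogen("ALA", "1HB") gives "HB1".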
A,C,T,G,I:
OnP becomes OPn
2HO* becomes HO2'
1X* becomes X'
2X* becomes X''
"5M" becomes "7"
nXm becomes Xmn
* becomes '
in that order.
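The nucleotide rules can be sketched the same way, as an ordered list of substitutions where each rule sees the output of the one before it. The patterns are my reading of the shorthand (n and m as digits, X as whatever sits between them), so treat them as assumptions rather than what pdb-extract actually does.

```python
import re

# Rules in the order listed above; order matters.
RULES = [
    (re.compile(r"^O(\d)P$"), r"OP\1"),               # OnP  -> OPn
    (re.compile(r"^2HO\*$"), "HO2'"),                 # 2HO* -> HO2'
    (re.compile(r"^1(.+)\*$"), r"\1'"),               # 1X*  -> X'
    (re.compile(r"^2(.+)\*$"), r"\1''"),              # 2X*  -> X''
    (re.compile(r"5M"), "7"),                         # 5M   -> 7
    (re.compile(r"^(\d)([A-Z]+)(\d+)$"), r"\2\3\1"),  # nXm  -> Xmn
    (re.compile(r"\*"), "'"),                         # *    -> '
]


def pdb_to_cif_nucleotide(name):
    """Apply every rule in turn to one PDB-style nucleotide atom name."""
    name = name.strip()
    for pattern, replacement in RULES:
        name = pattern.sub(replacement, name)
    return name
```

Because the rules run in sequence, "1H5M" first becomes "1H7" (5M rule) and then "H71" (nXm rule), which is why the order has to be kept.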
Then, in filterlib-v8\include\_atom_change_global.h we have seven fundamental models which are mapped to more interesting groups such as DAR, DAS, DCY.
It looks to me like this is a reverse-translation table, going BACK to PDB from three or more variants of mmCIF, where HXmp, pHXm, or mpHX (CIF?) all map back to nHXm. In this set, the CIF single quote is recorded as *.
Also shown in this file are back-mappings from CIF to PDB for the monomeric amino acids, which would indicate that there have been some additional differences in the naming of those in CIF format. For example, in alanine, which in its monomer state might have an H on the O or three Hs on N, it looks like PDB "H" (specifically just that; on the O?) has become "HN", and "1H/2H/3H" has become one of "HT1/HT2/HT3", "HN1/HN2/HN3", "1HT/2HT/3HT", or "H1/H2/H3" (three of these, so they must be on N). Yow! Likewise, arginine "HH" has become, alternatively, "HH" or "HN", with numbers every which way. Lots more there.
(I think there could be an error in this file regarding H5 becoming H5M, since in the other file H5M becomes H7.)
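As an illustration of what such a back-mapping amounts to, here is a small Python sketch covering only the alanine variants mentioned above. The table and function names are hypothetical, and the real header surely holds many more entries than this.

```python
# CIF-style variant -> PDB name, for monomeric alanine only (from the note above).
ALA_BACK_MAP = {"HN": "H"}
for n in "123":
    for variant in (f"HT{n}", f"HN{n}", f"{n}HT", f"H{n}"):
        ALA_BACK_MAP[variant] = f"{n}H"

# One table per residue; only ALA is filled in for this sketch.
BACK_MAPS = {"ALA": ALA_BACK_MAP}


def cif_to_pdb_name(residue, name):
    """Look up the PDB name for a CIF-style variant; unknown names pass through."""
    return BACK_MAPS.get(residue, {}).get(name, name)
```

So cif_to_pdb_name("ALA", "HT2") and cif_to_pdb_name("ALA", "2HT") both come back as "2H".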
I see from mmCIF output from the RCSB that BOTH the IUPAC and PDB names ("auth") are there. In actuality, it would appear that different authors have different conventions in PDB names (and maybe even CIF names), but from the above it also appears that even CIF name formats have evolved or are somewhat nonstandard.
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_alt_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_entity_id
_atom_site.label_seq_id
_atom_site.pdbx_PDB_ins_code
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
...
_atom_site.auth_seq_id
_atom_site.auth_comp_id
_atom_site.auth_asym_id
_atom_site.auth_atom_id
_atom_site.pdbx_PDB_model_num

ATOM 1 N N . LEU A 1 1 ... 3 LEU A N 1
ATOM 2 C CA . LEU A 1 1 ... 3 LEU A CA 1
ATOM 3 C C . LEU A 1 1 ... 3 LEU A C 1
...
ATOM 13 H HB2 . LEU A 1 1 ... 3 LEU A 1HB 1
ATOM 14 H HB3 . LEU A 1 1 ... 3 LEU A 2HB 1
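Pulling both names out of such a loop might look like the Python sketch below. It assumes simple whitespace-separated rows with no quoted values (real mmCIF quotes names containing a prime), and the sample is a trimmed, hypothetical loop that keeps only six of the columns, with atom names taken from the rows above.

```python
def read_atom_names(cif_text):
    """Return (label_atom_id, auth_atom_id) pairs from an _atom_site loop."""
    columns, pairs = [], []
    for line in cif_text.splitlines():
        line = line.strip()
        if line.startswith("_atom_site."):
            # Column order in the header defines the field order in each row.
            columns.append(line[len("_atom_site."):])
        elif line.startswith(("ATOM", "HETATM")) and columns:
            row = dict(zip(columns, line.split()))
            pairs.append((row["label_atom_id"], row["auth_atom_id"]))
    return pairs


# Trimmed sample: only six columns are kept for illustration.
SAMPLE = """\
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.auth_atom_id
ATOM 1 N N LEU N
ATOM 13 H HB2 LEU 1HB
ATOM 14 H HB3 LEU 2HB
"""

for label, auth in read_atom_names(SAMPLE):
    print(label, "<->", auth)
```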
So that translation is a snap, I think. What other files are you interested in reading or writing?
Bob
--
Robert M. Hanson, [EMAIL PROTECTED], 507-646-3107
Professor of Chemistry, St. Olaf College
1520 St. Olaf Ave., Northfield, MN 55057
mailto:[EMAIL PROTECTED]
http://www.stolaf.edu/people/hansonr
"Imagination is more important than knowledge." - Albert Einstein
_______________________________________________
Jmol-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jmol-developers
