Re: [Jmol-developers] * in .pdb <-> ' in .cif

Miguel Thu, 17 Jun 2004 02:19:05 -0700

Very good research ... but frankly it is more than I can digest in one chunk.


> The algorithm, as I figure it, is as follows. This is a hypothesis based
> on the contents of pdb-extract's iupac_atom_name_global.h
>
>  From PDB to mmCIF:
>
> ALA-VAL as part of chains are:
>
> a) C,N,O,S all same as PDB--no change
> b) Hydrogens renamed as follows:
>
>     HT becomes H
>
>     nHXm becomes HXmp
>
>     where
>
>       p=n when this is a CH3 carbon:
>
>               ALA-HB
>               ILE-HG2,HD1
>               LEU-HD1,HD2
>               LYS-HZ
>               MET-HE
>               THR-HG2
>               VAL-HG1,HG2
>
>
>       or p=(n+1) when this is CH2 carbon (all others).
>
> The rationale seems to be that you are counting substituents,
> not H atoms, so for -CH2R, R is #1, the first H is #2, the second H is #3
> instead of the old 1H,2H.

I don't understand the details here, but it generally makes sense.

But I am somewhat nervous because they are making distinctions between the
group types.
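
To make the quoted hydrogen rule concrete, here is a minimal Java sketch of
it. It only encodes the hypothesis as stated above; it is not taken from
pdb-extract. The class name PdbHydrogenRenamer, the method toCif and the
METHYL table are made up for illustration. For isoleucine the table uses HG2
and HD1, since CG2 and CD1 are the methyl carbons. Every non-methyl case is
treated as p=n+1, exactly as the hypothesis says; whether that also holds for
NH2 groups is not checked here.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PdbHydrogenRenamer {
    // Hypothetical table: hydrogen stems that sit on CH3 carbons, per residue.
    private static final Map<String, Set<String>> METHYL = Map.of(
        "ALA", Set.of("HB"),
        "ILE", Set.of("HG2", "HD1"),
        "LEU", Set.of("HD1", "HD2"),
        "LYS", Set.of("HZ"),
        "MET", Set.of("HE"),
        "THR", Set.of("HG2"),
        "VAL", Set.of("HG1", "HG2"));

    // nHXm: leading digit n, then H, a remoteness letter X, optional digit m.
    private static final Pattern OLD_H = Pattern.compile("([1-3])(H[A-Z][1-3]?)");

    /** Maps a PDB-style atom name to the hypothesized mmCIF name. */
    static String toCif(String residue, String pdbName) {
        if (pdbName.equals("HT")) {
            return "H";
        }
        Matcher m = OLD_H.matcher(pdbName);
        if (!m.matches()) {
            return pdbName; // C, N, O, S and anything unrecognized: no change
        }
        int n = m.group(1).charAt(0) - '0';
        String stem = m.group(2);
        boolean methyl = METHYL.getOrDefault(residue, Set.of()).contains(stem);
        int p = methyl ? n : n + 1; // CH3 keeps n, everything else shifts by one
        return stem + p;
    }
}
```

With this sketch, toCif("LEU", "1HB") gives HB2 and toCif("ALA", "1HB") gives
HB1, which is the group-type distinction that makes me nervous.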

> A,C,T,G,I:
>
>       OnP becomes OPn
>       2HO* becomes HO2'
>       1X* becomes X'
>       2X* becomes X''
>       "5M" becomes "7"
>       nXm becomes Xmn
>       * becomes '
>
> in that order.

That makes sense.
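
The nucleotide rules can be sketched as an ordered chain of string rewrites.
This is only my reading of the list above, assuming that n and m are single
digits and that X is one or more capital letters. The class name
NucleicAtomRenamer and the method toCif are hypothetical.

```java
final class NucleicAtomRenamer {
    /** Applies the quoted rules to one PDB atom name, in the quoted order. */
    static String toCif(String name) {
        String s = name;
        s = s.replaceAll("^O([1-3])P$", "OP$1");    // OnP  -> OPn
        s = s.replaceAll("^2HO\\*$", "HO2'");       // 2HO* -> HO2'
        s = s.replaceAll("^1(.+)\\*$", "$1'");      // 1X*  -> X'
        s = s.replaceAll("^2(.+)\\*$", "$1''");     // 2X*  -> X''
        s = s.replace("5M", "7");                   // 5M   -> 7
        s = s.replaceAll("^([1-3])([A-Z]+)([0-9])$", "$2$3$1"); // nXm -> Xmn
        s = s.replace("*", "'");                    // *    -> '
        return s;
    }
}
```

The order matters: 2HO* has to be handled before the general 2X* rule, or it
would come out as HO'' instead of HO2'. Likewise 1H5M must first become 1H7
so that the nXm rule can turn it into H71.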

> Then, in filterlib-v8\include\_atom_change_global.h we have seven
> fundamental models which are mapped to more interesting groups such as
> DAR, DAS, DCY.

Based upon my relatively short experience working with this data, if they
are using group names then they are doomed to failure.

> Looks to me like this is a reverse-translation table, going BACK to PDB
> from three or more variants of mmCIF, where HXmp, pHXm, or mpHX (CIF?)
> all map back to nHXm. In this set, the CIF single quote is recorded as *.
> Also shown in this file are back-mappings from CIF to PDB for the
> monomeric amino acids, which would indicate that there have been some
> additional differences in naming of those in CIF format. For example, in
> alanine, which in its monomer state might have an H on the O or three Hs
> on N, it looks like "H" (specifically, just that, on the O?, PDB) has
> become "HN", and "1H/2H/3H" has become one of "HT1/HT2/HT3" or
> "HN1/HN2/HN3" or "1HT/2HT/3HT" or "H1/H2/H3" (three of these--must be on
> N). Yow! Likewise, arginine "HH" has become, alternatively, "HH" or "HN",
> with numbers every which-way. Lots more there. (I think there could be an
> error in this file regarding H5 becoming H5M, since in the other file H5M
> becomes H7.)

I am now completely lost.

> I see from mmCIF output from the RCSB that BOTH the IUPAC and PDB names
> ("auth") are there. In actuality, it would appear that different authors
> have different conventions in PDB names (and maybe even CIF names), but
> from the above it also appears that even CIF name formats have evolved or
> are somewhat nonstandard.
>
> _atom_site.group_PDB
> _atom_site.id
> _atom_site.type_symbol
> _atom_site.label_atom_id
> _atom_site.label_alt_id
> _atom_site.label_comp_id
> _atom_site.label_asym_id
> _atom_site.label_entity_id
> _atom_site.label_seq_id
> _atom_site.pdbx_PDB_ins_code
> _atom_site.Cartn_x
> _atom_site.Cartn_y
> _atom_site.Cartn_z
> ...
> _atom_site.auth_seq_id
> _atom_site.auth_comp_id
> _atom_site.auth_asym_id
> _atom_site.auth_atom_id
> _atom_site.pdbx_PDB_model_num
>
> ATOM 1     N N     . LEU A 1 1  ... 3  LEU A N    1
> ATOM 2     C CA    . LEU A 1 1  ... 3  LEU A CA   1
> ATOM 3     C C     . LEU A 1 1  ... 3  LEU A C    1
> ...
> ATOM 13    H HB2   . LEU A 1 1  ... 3  LEU A 1HB  1
> ATOM 14    H HB3   . LEU A 1 1  ... 3  LEU A 2HB  1
>
>
>
> So that translation is a snap, I think.

Glad it looks easy to you :-)

I'll call for your help when I start implementing it :-)
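
For when that time comes, a rough Java sketch of the lookup: since each
atom_site row carries both label_atom_id and auth_atom_id, the pairing can be
read straight from the file. The class name AtomSiteNames and the method
authToLabel are hypothetical. The sketch assumes the loop tags and data rows
have already been split out, and that no value is quoted or contains spaces,
which a real CIF reader could not assume.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AtomSiteNames {
    /** Returns a map from auth_atom_id (PDB name) to label_atom_id. */
    static Map<String, String> authToLabel(List<String> tags, List<String> rows) {
        int label = tags.indexOf("_atom_site.label_atom_id");
        int auth = tags.indexOf("_atom_site.auth_atom_id");
        if (label < 0 || auth < 0) {
            throw new IllegalArgumentException("missing atom id columns");
        }
        Map<String, String> names = new LinkedHashMap<>();
        for (String row : rows) {
            String[] fields = row.trim().split("\\s+");
            if (fields.length != tags.size()) {
                continue; // skip rows that do not match the column count
            }
            names.put(fields[auth], fields[label]);
        }
        return names;
    }
}
```

Fed the two hydrogen rows from the LEU example, this pairs 1HB with HB2 and
2HB with HB3. A real table would also need the residue name in the key,
because the same PDB name can map differently in different groups.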

> What other files are you interested in reading or writing?

Nothing else.

I would just like things to work well with mmcif files ... since that is
supposed to be the 'new' standard.

That way, I could start reporting bugs to the PDB in .cif files instead of
.pdb files.

I also assumed that it would be best if the students of the future used
the standard labelling instead of the .pdb * format. So I was hoping to
start using the ' names within Jmol.

I now fear that this is too much for me to bite off at this time ... I am
going to let it rest for a while.


Miguel


