Thanks All - I think I am in a good place now.

I can get the SMILES from Paul's mmCIF links and then use Sereina's
magic three lines to do what I want.  I'd cross my fingers - but with RDKit
you don't need to.
This works for all Chemical Components (or whatever other fashionable name
they go by these days) in the PDB.
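
For anyone finding this later, here is a minimal sketch of that flow, not
a definitive recipe. Assumptions on my part: the PDB block is a made-up
four-atom acetamide fragment with hypothetical coordinates and a
placeholder residue name (UNL), standing in for a real ligand such as 84T;
the template SMILES is typed in by hand rather than pulled from an mmCIF
file; and hetatm() is just a hypothetical helper that pads the fixed PDB
columns so nothing gets mangled by line wrapping.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def hetatm(serial, name, x, y, z, element):
    """Format one fixed-column PDB HETATM record (toy residue UNL, chain A)."""
    return ("HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f          %2s"
            % (serial, name, "UNL", "A", 1, x, y, z, 1.0, 0.0, element))


# Heavy atoms only and no CONECT records, as in a typical ligand excerpt.
# The coordinates are hypothetical but have bond-like distances, so the
# reader's proximity bonding can work out which atoms are connected.
pdb_block = "\n".join([
    hetatm(1, " C1", 0.000, 0.000, 0.000, "C"),
    hetatm(2, " C2", 1.500, 0.000, 0.000, "C"),
    hetatm(3, " O1", 2.100, 1.050, 0.000, "O"),
    hetatm(4, " N1", 2.200, -1.100, 0.000, "N"),
    "END",
]) + "\n"

# Step 1: read the PDB data. Connectivity is perceived, bond orders are not,
# so every bond comes back as a single bond.
tmp = Chem.MolFromPDBBlock(pdb_block)

# Step 2: build a template with the correct bond orders from SMILES.
template = Chem.MolFromSmiles("CC(N)=O")

# Step 3: copy the template's bond orders onto the PDB molecule.
mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)

print(Chem.MolToSmiles(tmp))  # all single bonds
print(Chem.MolToSmiles(mol))  # bond orders taken from the template
```

With a real file you would use Chem.MolFromPDBFile(yourfilename) in step 1,
exactly as in Sereina's snippet. The template has to describe the same
heavy-atom graph as the PDB entry for the assignment to work.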

For posterity: I found a thread on the mailing list, started by James,
which sheds some light on this:
https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03481.html




On 13 January 2014 19:46, sereina riniker <sereina.rini...@gmail.com> wrote:

> Hi JP,
>
> If you have also a SMILES of the molecule you want to read from PDB, you
> can assign the bond orders based on this template:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> tmp = Chem.MolFromPDBFile(yourfilename)
> template = Chem.MolFromSmiles(yoursmiles)
> mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)
>
> Is this what you're looking for?
>
> Best,
> Sereina
>
>
> 2014/1/13 JP <jeanpaul.ebe...@inhibox.com>
>
>> RDKitters!
>>
>> Finally back on the mailing list!
>>
>> I am sure we've been through this at the UGM (my mind must have wandered
>> off!), but I have a quick question about the PDB reader and bond
>> perception.  Is bond perception supported by the current PDB reader?  I
>> remember that someone (PaulE, perhaps?) was saying bond perception was
>> painful, but that there was a dictionary for PDB ligands which helps (any
>> idea what this dictionary is called?).
>>
>> To the technical details.
>>
>> I am reading in the following PDB file with a simple MolFromPDBFile()
>> call:
>>
>> HETATM    1  O1P 84T A1862     -27.016   9.387 -72.564  1.00 20.81           O
>> HETATM    2  P   84T A1862     -27.282   9.818 -73.968  1.00 19.65           P
>> HETATM    3  O2P 84T A1862     -27.881  11.176 -74.182  1.00 21.49           O
>> HETATM    4  N   84T A1862     -25.869   9.583 -74.813  1.00 19.78           N
>> HETATM    5  C   84T A1862     -25.759  10.010 -76.075  1.00 19.97           C
>> HETATM    6  CA  84T A1862     -24.493   9.748 -76.807  1.00 19.75           C
>> HETATM    7  CB  84T A1862     -24.794   8.678 -77.847  1.00 19.73           C
>> HETATM    8  CG  84T A1862     -23.571   8.324 -78.681  1.00 19.70           C
>> HETATM    9  CD2 84T A1862     -23.309   9.519 -79.611  1.00 18.49           C
>> HETATM   10  CD1 84T A1862     -23.863   6.932 -79.305  1.00 18.60           C
>> HETATM   11  OHB 84T A1862     -25.210   7.467 -77.223  1.00 19.17           O
>> HETATM   12  OH  84T A1862     -23.549   9.127 -75.984  1.00 20.33           O
>> HETATM   13  O   84T A1862     -26.672  10.517 -76.692  1.00 20.26           O
>> HETATM   14  O5' 84T A1862     -28.377   8.861 -74.619  1.00 19.39           O
>> HETATM   15  C5' 84T A1862     -28.002   7.536 -74.954  1.00 18.47           C
>> HETATM   16  C4' 84T A1862     -28.909   7.000 -76.012  1.00 18.24           C
>> HETATM   17  C3' 84T A1862     -28.901   7.826 -77.298  1.00 18.28           C
>> HETATM   18  C2' 84T A1862     -30.318   7.610 -77.768  1.00 18.69           C
>> HETATM   19  O2' 84T A1862     -30.789   8.641 -78.581  1.00 19.64           O
>> HETATM   20  O4' 84T A1862     -30.262   6.951 -75.529  1.00 18.80           O
>> HETATM   21  C1' 84T A1862     -31.152   7.470 -76.521  1.00 19.01           C
>> HETATM   22  N9  84T A1862     -31.753   8.732 -76.009  1.00 20.08           N
>> HETATM   23  C4  84T A1862     -33.033   9.013 -76.158  1.00 21.10           C
>> HETATM   24  N3  84T A1862     -34.018   8.339 -76.786  1.00 21.58           N
>> HETATM   25  C2  84T A1862     -35.263   8.846 -76.830  1.00 21.95           C
>> HETATM   26  C8  84T A1862     -31.223   9.701 -75.291  1.00 20.27           C
>> HETATM   27  N7  84T A1862     -32.173  10.618 -75.019  1.00 21.28           N
>> HETATM   28  C5  84T A1862     -33.315  10.213 -75.563  1.00 21.81           C
>> HETATM   29  C6  84T A1862     -34.624  10.702 -75.627  1.00 22.85           C
>> HETATM   30  N1  84T A1862     -35.550  10.010 -76.285  1.00 22.44           N
>> HETATM   31  N6  84T A1862     -35.008  11.862 -75.052  1.00 23.86           N
>> TER
>> END
>>
>> But I am losing all the double bond (and aromatic) information:
>>
>> import sys
>> from rdkit import Chem
>> m = Chem.MolFromPDBFile(sys.argv[1])
>> print(Chem.MolToSmiles(m))
>>
>> Gives me:
>>
>> CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1
>>
>> As usual, many thanks for your time,
>>
>> -
>> Jean-Paul Ebejer
>> Early Stage Researcher
>>
>>
>> ------------------------------------------------------------------------------
>> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> Critical Workloads, Development Environments & Everything In Between.
>> Get a Quote or Start a Free Trial Today.
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>