Thanks All - I think I am in a good place now.
I can get the SMILES from Paul's mmCIF links and then use Sereina's
magic three lines to do what I want. I'd cross my fingers, but with RDKit
you don't need to.
This works for all Chemical Components (or whatever other fashionable name
they go by these days) in the PDB.
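For anyone finding this thread later, here is roughly how I would wrap those three lines. Treat it as a sketch rather than a tested recipe: the helper names (assign_orders, read_pdb_with_orders) are my own, and in the demo at the bottom I use a single-bond-only SMILES as a stand-in for what the PDB reader hands back, so the snippet runs without a PDB file. Acetic acid is just a toy example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def assign_orders(mol_without_orders, template_smiles):
    """Copy bond orders from a SMILES template onto a molecule that has
    the right connectivity but only single bonds."""
    template = Chem.MolFromSmiles(template_smiles)
    if template is None:
        raise ValueError("could not parse template SMILES: %r" % template_smiles)
    # Template first, molecule to be fixed second.
    return AllChem.AssignBondOrdersFromTemplate(template, mol_without_orders)


def read_pdb_with_orders(pdb_path, template_smiles):
    """Hypothetical convenience wrapper: read a ligand-only PDB file and
    fix its bond orders using a SMILES for the same compound."""
    mol = Chem.MolFromPDBFile(pdb_path)
    if mol is None:
        raise ValueError("could not read PDB file: %r" % pdb_path)
    return assign_orders(mol, template_smiles)


# Demo without a file: "CC(O)O" stands in for a PDB-derived molecule
# whose double bond was lost; "CC(=O)O" is the template.
no_orders = Chem.MolFromSmiles("CC(O)O")
fixed = assign_orders(no_orders, "CC(=O)O")
print(Chem.MolToSmiles(fixed))
```

RDKit may log a warning when the template matches in more than one way (here the two oxygens are interchangeable once all bonds are single); it then picks one of the matches. For a real ligand you would call read_pdb_with_orders() with the PDB file and the SMILES taken from the Chemical Component entry.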
For posterity: I have found a post in the mailing list started by James
which sheds some light on this:
https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03481.html
On 13 January 2014 19:46, sereina riniker <sereina.rini...@gmail.com> wrote:
> Hi JP,
>
> If you have also a SMILES of the molecule you want to read from PDB, you
> can assign the bond orders based on this template:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> tmp = Chem.MolFromPDBFile(yourfilename)
> template = Chem.MolFromSmiles(yoursmiles)
> mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)
>
> Is this what you're looking for?
>
> Best,
> Sereina
>
>
> 2014/1/13 JP <jeanpaul.ebe...@inhibox.com>
>
>> RDKitters!
>>
>> Finally back on the mailing list!
>>
>> I am sure we've been through this at the UGM (my mind must have wandered
>> off!), but a quick question about the PDB reader and bond perception. Is
>> this supported with the current PDB reader? I remember that someone
>> (PaulE, perhaps?) was saying bond perception was painful, but there was
>> some dictionary for PDB ligands which helps (any idea the name of this
>> dictionary?).
>>
>> To the technical details.
>>
>> I am reading in the following PDB file with a simple MolFromPDBFile()
>> call:
>>
>> HETATM 1 O1P 84T A1862 -27.016 9.387 -72.564 1.00 20.81 O
>> HETATM 2 P 84T A1862 -27.282 9.818 -73.968 1.00 19.65 P
>> HETATM 3 O2P 84T A1862 -27.881 11.176 -74.182 1.00 21.49 O
>> HETATM 4 N 84T A1862 -25.869 9.583 -74.813 1.00 19.78 N
>> HETATM 5 C 84T A1862 -25.759 10.010 -76.075 1.00 19.97 C
>> HETATM 6 CA 84T A1862 -24.493 9.748 -76.807 1.00 19.75 C
>> HETATM 7 CB 84T A1862 -24.794 8.678 -77.847 1.00 19.73 C
>> HETATM 8 CG 84T A1862 -23.571 8.324 -78.681 1.00 19.70 C
>> HETATM 9 CD2 84T A1862 -23.309 9.519 -79.611 1.00 18.49 C
>> HETATM 10 CD1 84T A1862 -23.863 6.932 -79.305 1.00 18.60 C
>> HETATM 11 OHB 84T A1862 -25.210 7.467 -77.223 1.00 19.17 O
>> HETATM 12 OH 84T A1862 -23.549 9.127 -75.984 1.00 20.33 O
>> HETATM 13 O 84T A1862 -26.672 10.517 -76.692 1.00 20.26 O
>> HETATM 14 O5' 84T A1862 -28.377 8.861 -74.619 1.00 19.39 O
>> HETATM 15 C5' 84T A1862 -28.002 7.536 -74.954 1.00 18.47 C
>> HETATM 16 C4' 84T A1862 -28.909 7.000 -76.012 1.00 18.24 C
>> HETATM 17 C3' 84T A1862 -28.901 7.826 -77.298 1.00 18.28 C
>> HETATM 18 C2' 84T A1862 -30.318 7.610 -77.768 1.00 18.69 C
>> HETATM 19 O2' 84T A1862 -30.789 8.641 -78.581 1.00 19.64 O
>> HETATM 20 O4' 84T A1862 -30.262 6.951 -75.529 1.00 18.80 O
>> HETATM 21 C1' 84T A1862 -31.152 7.470 -76.521 1.00 19.01 C
>> HETATM 22 N9 84T A1862 -31.753 8.732 -76.009 1.00 20.08 N
>> HETATM 23 C4 84T A1862 -33.033 9.013 -76.158 1.00 21.10 C
>> HETATM 24 N3 84T A1862 -34.018 8.339 -76.786 1.00 21.58 N
>> HETATM 25 C2 84T A1862 -35.263 8.846 -76.830 1.00 21.95 C
>> HETATM 26 C8 84T A1862 -31.223 9.701 -75.291 1.00 20.27 C
>> HETATM 27 N7 84T A1862 -32.173 10.618 -75.019 1.00 21.28 N
>> HETATM 28 C5 84T A1862 -33.315 10.213 -75.563 1.00 21.81 C
>> HETATM 29 C6 84T A1862 -34.624 10.702 -75.627 1.00 22.85 C
>> HETATM 30 N1 84T A1862 -35.550 10.010 -76.285 1.00 22.44 N
>> HETATM 31 N6 84T A1862 -35.008 11.862 -75.052 1.00 23.86 N
>> TER
>> END
>>
>> But I am losing all the double bond (and aromatic) information:
>>
>> import sys
>> from rdkit import Chem
>> m = Chem.MolFromPDBFile(sys.argv[1])
>> print(Chem.MolToSmiles(m))
>>
>> Gives me:
>>
>> CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1
>>
>> As usual, many thanks for your time,
>>
>> -
>> Jean-Paul Ebejer
>> Early Stage Researcher
>>
>>
>> ------------------------------------------------------------------------------
>> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> Critical Workloads, Development Environments & Everything In Between.
>> Get a Quote or Start a Free Trial Today.
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>