Apologies all -- but I am still having problems with this.

Reading
https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03485.html

"As far as I understood, the PDB reader assigns bond orders to the amino
acids in a protein, but if a ligand is present it puts all bonds of it to
SINGLE bonds as auto bond-type perception is not trivial (see Roger's
comments)."

However, I am unable to get bond orders for the protein side. Am I doing
something wrong, or is this the intended behaviour?
I imagine I can use AssignBondOrdersFromTemplate() for the 20 amino acids
and set these myself, or is there a better way to do this?

Also, is there a way to make AssignBondOrdersFromTemplate assign bond
orders to all matches?

>>> import rdkit
>>> from rdkit import Chem
>>> temp = Chem.MolFromSmiles('C=O')
>>> mol = Chem.MolFromSmiles('C(O)CC(O)')
>>> from rdkit.Chem import AllChem
>>> m2 = AllChem.AssignBondOrdersFromTemplate(temp, mol)
[12:24:56] WARNING: More than one matching pattern found - picking one
>>> print(Chem.MolToSmiles(m2))  # was expecting O=CCC=O
O=CCCO
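
A possible workaround for the "all matches" question, offered as a minimal
sketch rather than an RDKit feature: strip the bond orders from a copy of the
template, find every match with GetSubstructMatches(), and copy the template's
bond types onto each match by hand. The helper name
assign_bond_orders_all_matches is hypothetical (it is not part of RDKit), and
the sketch assumes the matches do not overlap; if they do, later matches
overwrite earlier ones.

```python
from rdkit import Chem


def assign_bond_orders_all_matches(template, mol):
    """Hypothetical helper: copy template bond orders onto every match in mol."""
    # Match on connectivity only: strip bond orders and aromaticity
    # from a copy of the template.
    query = Chem.RWMol(template)
    for bond in query.GetBonds():
        bond.SetBondType(Chem.rdchem.BondType.SINGLE)
        bond.SetIsAromatic(False)
    for atom in query.GetAtoms():
        atom.SetIsAromatic(False)

    result = Chem.RWMol(mol)
    # uniquify=True keeps one atom mapping per distinct set of matched atoms.
    matches = result.GetSubstructMatches(query, uniquify=True)
    for match in matches:
        for tbond in template.GetBonds():
            a = match[tbond.GetBeginAtomIdx()]
            b = match[tbond.GetEndAtomIdx()]
            rbond = result.GetBondBetweenAtoms(a, b)
            rbond.SetBondType(tbond.GetBondType())
            rbond.SetIsAromatic(tbond.GetIsAromatic())

    out = result.GetMol()
    # Recompute valences and implicit hydrogens for the new bond orders.
    Chem.SanitizeMol(out)
    return out


temp = Chem.MolFromSmiles('C=O')
mol = Chem.MolFromSmiles('C(O)CC(O)')
m_all = assign_bond_orders_all_matches(temp, mol)
print(Chem.MolToSmiles(m_all))
```

With the two molecules from the session above, both C-O bonds should come out
as double bonds, i.e. the same molecule as the SMILES O=CCC=O.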


Another thing I don't quite understand: in the code below I get a
"WARNING: More than one matching pattern found - picking one", but how can
my template match multiple times (it is not symmetrical)?

# (Using RDKit_2013_09_1)
import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem

ligand_mol = Chem.MolFromPDBBlock("""HETATM    1  C1  MRC A1993      30.994  82.769  82.139  1.00 18.68           C
HETATM    2  C2  MRC A1993      29.949  82.382  81.280  1.00 18.38           C
HETATM    3  C3  MRC A1993      28.809  83.090  80.875  1.00 16.44           C
HETATM    4  C4  MRC A1993      27.794  82.511  79.886  1.00 17.11           C
HETATM    5  C5  MRC A1993      26.268  82.360  79.965  1.00 16.74           C
HETATM    6  C6  MRC A1993      25.256  81.832  78.911  1.00 17.00           C
HETATM    7  C7  MRC A1993      23.832  81.867  79.556  1.00 17.45           C
HETATM    8  C8  MRC A1993      23.758  81.056  80.927  1.00 16.89           C
HETATM    9  C9  MRC A1993      23.820  79.467  80.419  1.00 17.84           C
HETATM   10  C10 MRC A1993      22.833  78.610  79.550  1.00 19.48           C
HETATM   11  C11 MRC A1993      22.999  78.593  78.193  1.00 20.56           C
HETATM   12  C12 MRC A1993      21.733  78.839  77.305  1.00 20.86           C
HETATM   13  C13 MRC A1993      21.779  78.052  75.821  1.00 20.74           C
HETATM   14  C14 MRC A1993      20.323  77.662  75.537  1.00 22.44           C
HETATM   15  C15 MRC A1993      28.456  84.523  81.348  1.00 12.97           C
HETATM   16  C16 MRC A1993      24.899  81.634  81.814  1.00 16.07           C
HETATM   17  C1' MRC A1993      38.561  75.401  83.188  1.00 53.39           C
HETATM   18  O1P MRC A1993      39.367  74.705  83.841  1.00 53.58           O
HETATM   19  O1Q MRC A1993      38.963  76.034  82.185  1.00 52.93           O
HETATM   20  C2' MRC A1993      37.074  75.480  83.615  1.00 51.57           C
HETATM   21  C3' MRC A1993      36.915  75.997  85.071  1.00 48.41           C
HETATM   22  C4' MRC A1993      35.513  76.588  85.323  1.00 45.07           C
HETATM   23  C5' MRC A1993      35.443  78.068  84.897  1.00 41.55           C
HETATM   24  C6' MRC A1993      34.033  78.631  85.167  1.00 37.19           C
HETATM   25  C7' MRC A1993      33.490  79.356  83.929  1.00 34.17           C
HETATM   26  C8' MRC A1993      33.454  80.886  84.151  1.00 31.34           C
HETATM   27  C9' MRC A1993      32.082  81.519  83.803  1.00 27.63           C
HETATM   28  O1A MRC A1993      32.056  81.880  82.413  1.00 22.28           O
HETATM   29  O1B MRC A1993      31.044  83.885  82.667  1.00 20.31           O
HETATM   30  O5  MRC A1993      26.209  81.625  81.183  1.00 16.19           O
HETATM   31  O7  MRC A1993      23.503  83.224  79.735  1.00 14.98           O
HETATM   32  O6  MRC A1993      25.399  82.787  77.821  1.00 15.00           O
HETATM   33  O10 MRC A1993      22.868  77.384  78.981  1.00 21.90           O
HETATM   34  C17 MRC A1993      21.395  80.405  77.027  1.00 20.53           C
HETATM   35  O13 MRC A1993      22.524  76.868  75.987  1.00 21.25           O
TER
END""")

template_ligand_mol = Chem.MolFromSmiles(
    "C[C@H](O)[C@H](C)[C@@H]1O[C@H]1C[C@H]2CO[C@@H]"
    "(C/C(C)=C/C(=O)OCCCCCCCCC(O)=O)[C@@H](O)[C@H]2O")

ligand_mol_with_bonds = AllChem.AssignBondOrdersFromTemplate(
    template_ligand_mol, ligand_mol)
# [12:33:39] WARNING: More than one matching pattern found - picking one

print(Chem.MolToSmiles(ligand_mol))
# CC(CC(O)OCCCCCCCCC(O)O)CC1OCC(CC2OC2C(C)C(C)O)C(O)C1O
print(Chem.MolToSmiles(ligand_mol_with_bonds))
# CC(=CC(=O)OCCCCCCCCC(=O)O)CC1OCC(CC2OC2C(C)C(C)O)C(O)C1O
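
One likely source of the warning, stated as an assumption about how
AssignBondOrdersFromTemplate() works internally (it appears to set all bonds
to single before matching and to count matches without uniquifying them): once
bond orders are ignored, the two oxygens of the terminal carboxylic acid
(O1P and O1Q above) are interchangeable, so the template maps onto the ligand
in more than one way even though it is not symmetrical as drawn. The sketch
below shows the effect on acetic acid as a small stand-in for the ligand; the
helper name count_connectivity_matches is hypothetical.

```python
from rdkit import Chem


def count_connectivity_matches(template, mol):
    """Hypothetical helper: count template-to-mol mappings, ignoring bond orders."""
    query = Chem.RWMol(template)
    for bond in query.GetBonds():
        bond.SetBondType(Chem.rdchem.BondType.SINGLE)
        bond.SetIsAromatic(False)
    for atom in query.GetAtoms():
        atom.SetIsAromatic(False)
    # uniquify=False keeps mappings that differ only by swapping
    # equivalent atoms, such as the two acid oxygens.
    return len(mol.GetSubstructMatches(query, uniquify=False))


# Acetic acid stands in for the carboxylic acid end of the ligand template.
acid_template = Chem.MolFromSmiles('CC(=O)O')
# The same atoms with single bonds only, as a PDB reader without bond
# perception would produce them.
acid_single_bonds = Chem.MolFromSmiles('CC(O)O')

n_matches = count_connectivity_matches(acid_template, acid_single_bonds)
print(n_matches)  # 2: the two oxygens can be swapped
```

If this is what happens, the two mappings would assign the C=O to different
oxygens of the acid group, which is chemically the same molecule, so the
warning would be harmless in that case.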

Any help would be greatly appreciated.

Thanks,
JP


On 13 January 2014 21:02, JP <jeanpaul.ebe...@inhibox.com> wrote:
>
> Thanks All - I think I am in a good place now.
>
> I can get the SMILES from Paul's mmcif links and then I can use Sereina's
> magic three lines to do what I want.  I'd cross my fingers - but with RDKit
> you don't need to.
> This works for all Chemical Components (or what other fashionable name
> they go by these days) in the PDB.
>
> For posterity: I have found a post in the mailing list started by James
> which sheds some light on this:
>
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03481.html
>
>
>
>
> On 13 January 2014 19:46, sereina riniker <sereina.rini...@gmail.com> wrote:
>>
>> Hi JP,
>>
>> If you have also a SMILES of the molecule you want to read from PDB, you
>> can assign the bond orders based on this template:
>>
>> tmp = Chem.MolFromPDBFile(yourfilename)
>> template = Chem.MolFromSmiles(yoursmiles)
>> mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)
>>
>> Is this what you're looking for?
>>
>> Best,
>> Sereina
>>
>>
>> 2014/1/13 JP <jeanpaul.ebe...@inhibox.com>
>>>
>>> RDKitters!
>>>
>>> Finally back on the mailing list!
>>>
>>> I am sure we've been through this at the UGM (my mind must have
>>> wandered off!), but a quick question about the PDB reader and bond
>>> perception.  Is this supported with the current PDB reader?  I remember
>>> that someone (PaulE, perhaps?) was saying bond perception was painful, but
>>> there was some dictionary for PDB ligands which helps (any idea the name of
>>> this dictionary?).
>>>
>>> To the technical details.
>>>
>>> I am reading in the following PDB file with a simple MolFromPDBFile()
>>> call:
>>>
>>> HETATM    1  O1P 84T A1862     -27.016   9.387 -72.564  1.00 20.81           O
>>> HETATM    2  P   84T A1862     -27.282   9.818 -73.968  1.00 19.65           P
>>> HETATM    3  O2P 84T A1862     -27.881  11.176 -74.182  1.00 21.49           O
>>> HETATM    4  N   84T A1862     -25.869   9.583 -74.813  1.00 19.78           N
>>> HETATM    5  C   84T A1862     -25.759  10.010 -76.075  1.00 19.97           C
>>> HETATM    6  CA  84T A1862     -24.493   9.748 -76.807  1.00 19.75           C
>>> HETATM    7  CB  84T A1862     -24.794   8.678 -77.847  1.00 19.73           C
>>> HETATM    8  CG  84T A1862     -23.571   8.324 -78.681  1.00 19.70           C
>>> HETATM    9  CD2 84T A1862     -23.309   9.519 -79.611  1.00 18.49           C
>>> HETATM   10  CD1 84T A1862     -23.863   6.932 -79.305  1.00 18.60           C
>>> HETATM   11  OHB 84T A1862     -25.210   7.467 -77.223  1.00 19.17           O
>>> HETATM   12  OH  84T A1862     -23.549   9.127 -75.984  1.00 20.33           O
>>> HETATM   13  O   84T A1862     -26.672  10.517 -76.692  1.00 20.26           O
>>> HETATM   14  O5' 84T A1862     -28.377   8.861 -74.619  1.00 19.39           O
>>> HETATM   15  C5' 84T A1862     -28.002   7.536 -74.954  1.00 18.47           C
>>> HETATM   16  C4' 84T A1862     -28.909   7.000 -76.012  1.00 18.24           C
>>> HETATM   17  C3' 84T A1862     -28.901   7.826 -77.298  1.00 18.28           C
>>> HETATM   18  C2' 84T A1862     -30.318   7.610 -77.768  1.00 18.69           C
>>> HETATM   19  O2' 84T A1862     -30.789   8.641 -78.581  1.00 19.64           O
>>> HETATM   20  O4' 84T A1862     -30.262   6.951 -75.529  1.00 18.80           O
>>> HETATM   21  C1' 84T A1862     -31.152   7.470 -76.521  1.00 19.01           C
>>> HETATM   22  N9  84T A1862     -31.753   8.732 -76.009  1.00 20.08           N
>>> HETATM   23  C4  84T A1862     -33.033   9.013 -76.158  1.00 21.10           C
>>> HETATM   24  N3  84T A1862     -34.018   8.339 -76.786  1.00 21.58           N
>>> HETATM   25  C2  84T A1862     -35.263   8.846 -76.830  1.00 21.95           C
>>> HETATM   26  C8  84T A1862     -31.223   9.701 -75.291  1.00 20.27           C
>>> HETATM   27  N7  84T A1862     -32.173  10.618 -75.019  1.00 21.28           N
>>> HETATM   28  C5  84T A1862     -33.315  10.213 -75.563  1.00 21.81           C
>>> HETATM   29  C6  84T A1862     -34.624  10.702 -75.627  1.00 22.85           C
>>> HETATM   30  N1  84T A1862     -35.550  10.010 -76.285  1.00 22.44           N
>>> HETATM   31  N6  84T A1862     -35.008  11.862 -75.052  1.00 23.86           N
>>> TER
>>> END
>>>
>>> But I am losing all the double bond (and aromatic) information:
>>>
>>> m = Chem.MolFromPDBFile(sys.argv[1])
>>> print Chem.MolToSmiles(m)
>>>
>>> Gives me:
>>>
>>> CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1
>>>
>>> As usual, many thanks for your time,
>>>
>>> -
>>> Jean-Paul Ebejer
>>> Early Stage Researcher
>>>
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>
>