Dear all,

I am generating RDKit conformers and writing them to PDB files, but I am finding 
that the atom indexing in RDKit differs from that in the written PDB file. I would 
like the two to match because I want to do a substructure search (using RDKit) to 
give me a handle on these atoms in the PDB file.

Apologies if this has been discussed before.

Here is my code and output (the C=O looks like it's atoms 3,4 in rdkit but 4,5 
in the pdb file):

Thanks,

Susan

*********************

In [1]: import rdkit

In [2]: from rdkit import Chem
   ...: from rdkit.Chem import AllChem
   ...: from rdkit.Chem.Draw import IPythonConsole
   ...:

In [3]: mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
   ...: idx = AllChem.EmbedMultipleConfs(mol,numConfs=1,randomSeed=0xf00d,
   ...:                                  useExpTorsionAnglePrefs=True,useBasicKnowledge=True)
   ...:

In [4]: mol.GetSubstructMatch(Chem.MolFromSmiles('C(=O)'))
Out[4]: (3, 4)

In [5]: Chem.MolToPDBFile(mol,'./test.pdb')

In [6]: import biopandas
   ...: from biopandas.pdb import PandasPDB
   ...: ppdb = PandasPDB()
   ...: ppdb.read_pdb('./test.pdb')
   ...: ppdb.df['HETATM']
   ...:
Out[6]:
  record_name  atom_number blank_1 atom_name alt_loc residue_name blank_2  \
0      HETATM            1                C1                  UNL
1      HETATM            2                C2                  UNL
2      HETATM            3                C3                  UNL
3      HETATM            4                C4                  UNL
4      HETATM            5                O1                  UNL
5      HETATM            6                C5                  UNL
6      HETATM            7                C6                  UNL
7      HETATM            8                C7                  UNL
8      HETATM            9                C8                  UNL
9      HETATM           10                C9                  UNL

  chain_id  residue_number insertion    ...    x_coord  y_coord  z_coord  \
0                        1              ...      0.176    1.911    1.137
1                        1              ...     -0.513    0.759    0.511
2                        1              ...      0.272   -0.184   -0.139
3                        1              ...      1.717   -0.056   -0.210
4                        1              ...      2.406   -0.917   -0.801
5                        1              ...      2.344    1.118    0.435
6                        1              ...     -0.332   -1.286   -0.743
7                        1              ...     -1.696   -1.416   -0.682
8                        1              ...     -2.495   -0.504   -0.048
9                        1              ...     -1.879    0.575    0.540

   occupancy  b_factor  blank_4 segment_id element_symbol charge  line_idx
0        1.0       0.0                                  C    NaN         0
1        1.0       0.0                                  C    NaN         1
2        1.0       0.0                                  C    NaN         2
3        1.0       0.0                                  C    NaN         3
4        1.0       0.0                                  O    NaN         4
5        1.0       0.0                                  C    NaN         5
6        1.0       0.0                                  C    NaN         6
7        1.0       0.0                                  C    NaN         7
8        1.0       0.0                                  C    NaN         8
9        1.0       0.0                                  C    NaN         9

[10 rows x 21 columns]
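
The offset in the output above is consistent with RDKit atom indices being 0-based while PDB atom serial numbers start at 1. Below is a minimal sketch of the conversion. It assumes the PDB writer emits atoms in RDKit index order (as the table above suggests) and that no atoms were added or reordered between the substructure match and the write. The helper name `to_pdb_serials` is hypothetical, and the atom names are copied from the table above.

```python
# Atom names in the order they appear in test.pdb (copied from the output above).
pdb_atom_names = ["C1", "C2", "C3", "C4", "O1", "C5", "C6", "C7", "C8", "C9"]


def to_pdb_serials(match):
    """Convert 0-based RDKit atom indices to 1-based PDB serial numbers.

    Hypothetical helper; assumes atoms were written in RDKit index order.
    """
    return tuple(idx + 1 for idx in match)


match = (3, 4)  # result of mol.GetSubstructMatch(...) in the session above
serials = to_pdb_serials(match)                  # PDB atom_number values
names = [pdb_atom_names[idx] for idx in match]   # PDB atom names for the match
print(serials, names)
```

Under the same assumption, the matching rows of the biopandas table could also be picked out by position, for example with `ppdb.df['HETATM'].iloc[list(match)]`, since the DataFrame's row index is 0-based like the RDKit indices.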
