Hi all!

I'm having problems reading in PDB records with altloc B. The code below
tries to read in residue 702, chain C of 4kio.pdb
(https://files.rcsb.org/view/4KIO.pdb). A mol object is returned, but it
appears to be empty (or at least Chem.MolToSmiles() returns an empty string
for it).

In [1]: import rdkit

In [2]: from rdkit import Chem

In [3]: rdkit.__version__
Out[3]: '2019.03.2'

In [4]: m_altlocB = Chem.MolFromPDBBlock("""
   ...: HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
   ...: HETATM 7887  C2 BG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
   ...: HETATM 7888  C3 BG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
   ...: HETATM 7889  O4 BG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
   ...: HETATM 7890  N5 BG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
   ...: HETATM 7891  C6 BG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
   ...: HETATM 7892  C7 BG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
   ...: HETATM 7893  C8 BG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
   ...: HETATM 7894  C9 BG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
   ...: HETATM 7895  N10BG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
   ...: HETATM 7896  C11BG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
   ...: HETATM 7897  N12BG5K C 702       0.291  14.135   6.316  0.51 32.89           N
   ...: HETATM 7898  C13BG5K C 702       0.365  15.236   5.562  0.51 37.92           C
   ...: HETATM 7899  C14BG5K C 702       1.667  15.577   4.898  0.51 45.61           C
   ...: HETATM 7900  N15BG5K C 702       2.013  14.870   3.674  0.51 54.36           N
   ...: HETATM 7901  C16BG5K C 702       2.736  13.630   3.932  0.51 61.64           C
   ...: HETATM 7902  C17BG5K C 702       3.129  13.030   2.594  0.51 67.85           C
   ...: HETATM 7903  O18BG5K C 702       1.993  12.804   1.757  0.51 70.80           O
   ...: HETATM 7904  C19BG5K C 702       1.368  14.060   1.475  0.51 61.74           C
   ...: HETATM 7905  C20BG5K C 702       0.884  14.647   2.784  0.51 59.26           C
   ...: HETATM 7906  C21BG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
   ...: HETATM 7907  C22BG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
   ...: HETATM 7908  N23BG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
   ...: HETATM 7909  C24BG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
   ...: HETATM 7910  N25BG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
   ...: HETATM 7911  C26BG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
   ...: HETATM 7912  C27BG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
   ...: HETATM 7913  C28BG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
   ...: HETATM 7914  C29BG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
   ...: HETATM 7915  N30BG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
   ...: HETATM 7916  C31BG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
   ...: HETATM 7917  S32BG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
   ...: HETATM 7918  N33BG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
   ...: """)

In [5]: Chem.MolToSmiles(m_altlocB)
Out[5]: ''

But if I change the altloc column (column 17) from B to A, the block reads
in fine and the SMILES is printed as expected.

In [6]: m_altlocA = Chem.MolFromPDBBlock("""
   ...: HETATM 7886  C1 AG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
   ...: HETATM 7887  C2 AG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
   ...: HETATM 7888  C3 AG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
   ...: HETATM 7889  O4 AG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
   ...: HETATM 7890  N5 AG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
   ...: HETATM 7891  C6 AG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
   ...: HETATM 7892  C7 AG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
   ...: HETATM 7893  C8 AG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
   ...: HETATM 7894  C9 AG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
   ...: HETATM 7895  N10AG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
   ...: HETATM 7896  C11AG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
   ...: HETATM 7897  N12AG5K C 702       0.291  14.135   6.316  0.51 32.89           N
   ...: HETATM 7898  C13AG5K C 702       0.365  15.236   5.562  0.51 37.92           C
   ...: HETATM 7899  C14AG5K C 702       1.667  15.577   4.898  0.51 45.61           C
   ...: HETATM 7900  N15AG5K C 702       2.013  14.870   3.674  0.51 54.36           N
   ...: HETATM 7901  C16AG5K C 702       2.736  13.630   3.932  0.51 61.64           C
   ...: HETATM 7902  C17AG5K C 702       3.129  13.030   2.594  0.51 67.85           C
   ...: HETATM 7903  O18AG5K C 702       1.993  12.804   1.757  0.51 70.80           O
   ...: HETATM 7904  C19AG5K C 702       1.368  14.060   1.475  0.51 61.74           C
   ...: HETATM 7905  C20AG5K C 702       0.884  14.647   2.784  0.51 59.26           C
   ...: HETATM 7906  C21AG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
   ...: HETATM 7907  C22AG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
   ...: HETATM 7908  N23AG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
   ...: HETATM 7909  C24AG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
   ...: HETATM 7910  N25AG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
   ...: HETATM 7911  C26AG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
   ...: HETATM 7912  C27AG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
   ...: HETATM 7913  C28AG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
   ...: HETATM 7914  C29AG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
   ...: HETATM 7915  N30AG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
   ...: HETATM 7916  C31AG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
   ...: HETATM 7917  S32AG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
   ...: HETATM 7918  N33AG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
   ...: """)

In [7]: Chem.MolToSmiles(m_altlocA)
Out[7]: 'CC(O)N1CC[C@H](NC2NC(CN3CCOCC3)CC(NC3NC4CCCNC4S3)N2)C1'
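
For reference, the manual edit above (changing column 17 from B to A) could
be scripted roughly as follows. This is only a minimal sketch of the
workaround, not an explanation of the parser's behaviour: select_altloc and
its keep/relabel parameters are hypothetical names of my own, not part of
RDKit, and it assumes fixed-column PDB records where the altloc is the 17th
character of each ATOM/HETATM line.

```python
def select_altloc(pdb_block, keep="B", relabel="A"):
    """Keep one alternate location and relabel it (hypothetical helper).

    ATOM/HETATM records whose altloc (column 17) is blank are kept as-is,
    records with altloc == keep are kept and relabelled, and records with
    any other altloc are dropped. All other record types pass through
    unchanged (ANISOU records are not handled in this sketch).
    """
    out = []
    for line in pdb_block.splitlines():
        if line.startswith(("ATOM  ", "HETATM")) and len(line) >= 17:
            altloc = line[16]  # column 17 in 1-based PDB numbering
            if altloc not in (" ", keep):
                continue  # drop the other conformers
            if altloc == keep:
                # overwrite only the altloc character, keep the rest intact
                line = line[:16] + relabel + line[17:]
        out.append(line)
    return "\n".join(out) + "\n"


# First record from the altloc B block above
record = ("HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174"
          "  0.51 48.52           C")
fixed = select_altloc(record, keep="B")
print(fixed[16])  # prints "A"
```

The returned string could then be passed to Chem.MolFromPDBBlock() in place
of the original block, which is the same substitution as the hand-edited
altloc A block above.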

Am I doing something wrong? I have also attached this code as a .ipynb.

Many thanks in advance!

Susan

Attachment: rdkit_issue_with_altloc_B.ipynb

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
