Dear Susan,

The reason is that, by default, PDBAtomLine() ignores records whose alternate
location is anything other than ' ', 'A' or '1':

https://github.com/rdkit/rdkit/blob/e7e17adc4ef822d2663fa6e1ba5b978512c7a8b4/Code/GraphMol/FileParsers/PDBParser.cpp#L62
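
As a rough Python paraphrase of that check (a sketch only, not the RDKit
source): the helper name is_skipped_altloc is made up for illustration, it
models nothing but the alternate-location test, and it assumes that setting
bit 1 of flavor is what switches the test off.

```python
def is_skipped_altloc(line: str, flavor: int = 0) -> bool:
    """Return True if an ATOM/HETATM record would be dropped for its altloc.

    Hypothetical helper: it mirrors only the alternate-location test
    described above, not anything else the PDB parser does.
    """
    if flavor & 1:       # assumption: bit 1 set means altlocs are not filtered
        return False
    if len(line) < 17:   # record too short to carry an altloc field
        return False
    return line[16] not in (" ", "A", "1")  # column 17 is index 16


# First record from the question, and the same record relabelled as altloc A.
rec_b = "HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C"
rec_a = rec_b[:16] + "A" + rec_b[17:]

print(is_skipped_altloc(rec_b))            # True
print(is_skipped_altloc(rec_a))            # False
print(is_skipped_altloc(rec_b, flavor=1))  # False
```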

I have run into PDB files in the past that contain a single alternate
location which is none of the above (e.g., only 'B').
To avoid the problem, you can set flavor=1 in MolFromPDBBlock():

from rdkit import Chem

# flavor=1 keeps records whose alternate location is not ' ', 'A' or '1'
m_altlocB = Chem.MolFromPDBBlock("""
HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
HETATM 7887  C2 BG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
HETATM 7888  C3 BG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
HETATM 7889  O4 BG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
HETATM 7890  N5 BG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
HETATM 7891  C6 BG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
HETATM 7892  C7 BG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
HETATM 7893  C8 BG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
HETATM 7894  C9 BG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
HETATM 7895  N10BG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
HETATM 7896  C11BG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
HETATM 7897  N12BG5K C 702       0.291  14.135   6.316  0.51 32.89           N
HETATM 7898  C13BG5K C 702       0.365  15.236   5.562  0.51 37.92           C
HETATM 7899  C14BG5K C 702       1.667  15.577   4.898  0.51 45.61           C
HETATM 7900  N15BG5K C 702       2.013  14.870   3.674  0.51 54.36           N
HETATM 7901  C16BG5K C 702       2.736  13.630   3.932  0.51 61.64           C
HETATM 7902  C17BG5K C 702       3.129  13.030   2.594  0.51 67.85           C
HETATM 7903  O18BG5K C 702       1.993  12.804   1.757  0.51 70.80           O
HETATM 7904  C19BG5K C 702       1.368  14.060   1.475  0.51 61.74           C
HETATM 7905  C20BG5K C 702       0.884  14.647   2.784  0.51 59.26           C
HETATM 7906  C21BG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
HETATM 7907  C22BG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
HETATM 7908  N23BG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
HETATM 7909  C24BG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
HETATM 7910  N25BG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
HETATM 7911  C26BG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
HETATM 7912  C27BG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
HETATM 7913  C28BG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
HETATM 7914  C29BG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
HETATM 7915  N30BG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
HETATM 7916  C31BG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
HETATM 7917  S32BG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
HETATM 7918  N33BG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
""", flavor=1)
Chem.MolToSmiles(m_altlocB)

'CC(O)N1CC[C@H](NC2NC(CN3CCOCC3)CC(NC3NC4CCCNC4S3)N2)C1'
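
One caveat, stated as an assumption rather than something tested here: with
the altloc filter switched off, a residue that has several conformers in the
file would presumably be read with all of them at once. If that matters, the
block can be pre-filtered to a single altloc before parsing, which generalises
the manual B-to-A edit from the original question. A minimal sketch follows;
select_altloc is a hypothetical helper, not part of RDKit, it assumes
fixed-column PDB records, and the altloc A line in the toy input is invented
for the example.

```python
def select_altloc(pdb_block: str, keep: str = "B") -> str:
    """Keep ATOM/HETATM records whose altloc is blank or `keep`.

    Hypothetical helper, not an RDKit function. The altloc column of the
    kept records is blanked; all other record types pass through unchanged.
    """
    out = []
    for line in pdb_block.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 17:
            if line[16] not in (" ", keep):
                continue  # drop records belonging to other conformers
            line = line[:16] + " " + line[17:]  # blank column 17
        out.append(line)
    return "\n".join(out)


# Toy input: the B record is from the question, the A record is made up.
toy_block = "\n".join([
    "HETATM 7885  C1 AG5K C 702      -0.900  12.600  14.100  0.49 48.00           C",
    "HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C",
])
filtered = select_altloc(toy_block, keep="B")
print(filtered)
# `filtered` could then be passed to Chem.MolFromPDBBlock() with the default flavor.
```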


Cheers,
p.

On Wed, Dec 9, 2020 at 6:07 PM Susan Leung <susanhle...@gmail.com> wrote:

> Hi all!
>
> I'm having problems reading in a PDB file with altloc B. See the code below,
> which tries to read in residue 702, chain C of 4kio.pdb (
> https://files.rcsb.org/view/4KIO.pdb). A mol object is returned, but it
> hasn't got anything in it (or at least it returns an empty string when I call
> Chem.MolToSmiles() on it).
>
> In [1]: import rdkit
>
> In [2]: from rdkit import Chem
>
> In [3]: rdkit.__version__
> Out[3]: '2019.03.2'
>
> In [4]: m_altlocB = Chem.MolFromPDBBlock("""
>    ...: HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
>    ...: HETATM 7887  C2 BG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
>    ...: HETATM 7888  C3 BG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
>    ...: HETATM 7889  O4 BG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
>    ...: HETATM 7890  N5 BG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
>    ...: HETATM 7891  C6 BG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
>    ...: HETATM 7892  C7 BG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
>    ...: HETATM 7893  C8 BG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
>    ...: HETATM 7894  C9 BG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
>    ...: HETATM 7895  N10BG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
>    ...: HETATM 7896  C11BG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
>    ...: HETATM 7897  N12BG5K C 702       0.291  14.135   6.316  0.51 32.89           N
>    ...: HETATM 7898  C13BG5K C 702       0.365  15.236   5.562  0.51 37.92           C
>    ...: HETATM 7899  C14BG5K C 702       1.667  15.577   4.898  0.51 45.61           C
>    ...: HETATM 7900  N15BG5K C 702       2.013  14.870   3.674  0.51 54.36           N
>    ...: HETATM 7901  C16BG5K C 702       2.736  13.630   3.932  0.51 61.64           C
>    ...: HETATM 7902  C17BG5K C 702       3.129  13.030   2.594  0.51 67.85           C
>    ...: HETATM 7903  O18BG5K C 702       1.993  12.804   1.757  0.51 70.80           O
>    ...: HETATM 7904  C19BG5K C 702       1.368  14.060   1.475  0.51 61.74           C
>    ...: HETATM 7905  C20BG5K C 702       0.884  14.647   2.784  0.51 59.26           C
>    ...: HETATM 7906  C21BG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
>    ...: HETATM 7907  C22BG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
>    ...: HETATM 7908  N23BG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
>    ...: HETATM 7909  C24BG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
>    ...: HETATM 7910  N25BG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
>    ...: HETATM 7911  C26BG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
>    ...: HETATM 7912  C27BG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
>    ...: HETATM 7913  C28BG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
>    ...: HETATM 7914  C29BG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
>    ...: HETATM 7915  N30BG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
>    ...: HETATM 7916  C31BG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
>    ...: HETATM 7917  S32BG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
>    ...: HETATM 7918  N33BG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
>    ...: """)
>
> In [5]: Chem.MolToSmiles(m_altlocB)
> Out[5]: ''
>
> But if I change the altloc column (column 17) from B to A, then it
> reads in and prints the SMILES fine.
>
> In [6]: m_altlocA = Chem.MolFromPDBBlock("""
>    ...: HETATM 7886  C1 AG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
>    ...: HETATM 7887  C2 AG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
>    ...: HETATM 7888  C3 AG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
>    ...: HETATM 7889  O4 AG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
>    ...: HETATM 7890  N5 AG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
>    ...: HETATM 7891  C6 AG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
>    ...: HETATM 7892  C7 AG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
>    ...: HETATM 7893  C8 AG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
>    ...: HETATM 7894  C9 AG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
>    ...: HETATM 7895  N10AG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
>    ...: HETATM 7896  C11AG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
>    ...: HETATM 7897  N12AG5K C 702       0.291  14.135   6.316  0.51 32.89           N
>    ...: HETATM 7898  C13AG5K C 702       0.365  15.236   5.562  0.51 37.92           C
>    ...: HETATM 7899  C14AG5K C 702       1.667  15.577   4.898  0.51 45.61           C
>    ...: HETATM 7900  N15AG5K C 702       2.013  14.870   3.674  0.51 54.36           N
>    ...: HETATM 7901  C16AG5K C 702       2.736  13.630   3.932  0.51 61.64           C
>    ...: HETATM 7902  C17AG5K C 702       3.129  13.030   2.594  0.51 67.85           C
>    ...: HETATM 7903  O18AG5K C 702       1.993  12.804   1.757  0.51 70.80           O
>    ...: HETATM 7904  C19AG5K C 702       1.368  14.060   1.475  0.51 61.74           C
>    ...: HETATM 7905  C20AG5K C 702       0.884  14.647   2.784  0.51 59.26           C
>    ...: HETATM 7906  C21AG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
>    ...: HETATM 7907  C22AG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
>    ...: HETATM 7908  N23AG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
>    ...: HETATM 7909  C24AG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
>    ...: HETATM 7910  N25AG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
>    ...: HETATM 7911  C26AG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
>    ...: HETATM 7912  C27AG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
>    ...: HETATM 7913  C28AG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
>    ...: HETATM 7914  C29AG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
>    ...: HETATM 7915  N30AG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
>    ...: HETATM 7916  C31AG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
>    ...: HETATM 7917  S32AG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
>    ...: HETATM 7918  N33AG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
>    ...: """)
>
> In [7]: Chem.MolToSmiles(m_altlocA)
> Out[7]: 'CC(O)N1CC[C@H](NC2NC(CN3CCOCC3)CC(NC3NC4CCCNC4S3)N2)C1'
>
> Am I doing something wrong? I also attach this code as a .ipynb.
>
> Many thanks in advance!
>
> Susan
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>