Dear Paolo,

Great, that makes sense. Thanks so much for the explanation and solution!

Susan

On Wed, Dec 9, 2020 at 9:21 PM Paolo Tosco <paolo.tosco.m...@gmail.com>
wrote:

> Dear Susan,
>
> the reason is that PDBAtomLine() ignores records where the alternate
> location is different from ' ', 'A' or '1':
>
>
> https://github.com/rdkit/rdkit/blob/e7e17adc4ef822d2663fa6e1ba5b978512c7a8b4/Code/GraphMol/FileParsers/PDBParser.cpp#L62
>
> In the past I have run into PDB files that have a single alternate
> location which is none of the above (e.g., only 'B').
> To avoid the problem you can set flavor=1 in MolFromPDBBlock():
>
> m_altlocB = Chem.MolFromPDBBlock("""
> HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
> HETATM 7887  C2 BG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
> HETATM 7888  C3 BG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
> HETATM 7889  O4 BG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
> HETATM 7890  N5 BG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
> HETATM 7891  C6 BG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
> HETATM 7892  C7 BG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
> HETATM 7893  C8 BG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
> HETATM 7894  C9 BG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
> HETATM 7895  N10BG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
> HETATM 7896  C11BG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
> HETATM 7897  N12BG5K C 702       0.291  14.135   6.316  0.51 32.89           N
> HETATM 7898  C13BG5K C 702       0.365  15.236   5.562  0.51 37.92           C
> HETATM 7899  C14BG5K C 702       1.667  15.577   4.898  0.51 45.61           C
> HETATM 7900  N15BG5K C 702       2.013  14.870   3.674  0.51 54.36           N
> HETATM 7901  C16BG5K C 702       2.736  13.630   3.932  0.51 61.64           C
> HETATM 7902  C17BG5K C 702       3.129  13.030   2.594  0.51 67.85           C
> HETATM 7903  O18BG5K C 702       1.993  12.804   1.757  0.51 70.80           O
> HETATM 7904  C19BG5K C 702       1.368  14.060   1.475  0.51 61.74           C
> HETATM 7905  C20BG5K C 702       0.884  14.647   2.784  0.51 59.26           C
> HETATM 7906  C21BG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
> HETATM 7907  C22BG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
> HETATM 7908  N23BG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
> HETATM 7909  C24BG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
> HETATM 7910  N25BG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
> HETATM 7911  C26BG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
> HETATM 7912  C27BG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
> HETATM 7913  C28BG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
> HETATM 7914  C29BG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
> HETATM 7915  N30BG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
> HETATM 7916  C31BG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
> HETATM 7917  S32BG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
> HETATM 7918  N33BG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
> """, flavor=1)
> Chem.MolToSmiles(m_altlocB)
>
> 'CC(O)N1CC[C@H](NC2NC(CN3CCOCC3)CC(NC3NC4CCCNC4S3)N2)C1'
>
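As a rough sketch of how one might check in advance whether a PDB block is affected, the snippet below collects the alternate-location characters from column 17 of the ATOM/HETATM records and reports whether any fall outside ' ', 'A' and '1'. It is plain Python and does not call RDKit; the names DEFAULT_ALTLOCS, altloc_codes and has_skipped_altlocs are made up for illustration, and the fixed-column slicing assumes well-formed PDB records.

```python
# Alternate-location codes kept by the default reader, per the explanation
# above (blank, 'A' or '1'). Illustrative constant, not an RDKit name.
DEFAULT_ALTLOCS = {" ", "A", "1"}


def altloc_codes(pdb_block):
    """Return the set of altLoc characters (column 17) in ATOM/HETATM records."""
    codes = set()
    for line in pdb_block.splitlines():
        # Column 17 in the PDB format is index 16 in a Python string.
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 17:
            codes.add(line[16])
    return codes


def has_skipped_altlocs(pdb_block):
    """True if any record carries an altLoc outside DEFAULT_ALTLOCS."""
    return bool(altloc_codes(pdb_block) - DEFAULT_ALTLOCS)


# Two records taken from the example above (altLoc 'B' only).
block = """\
HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
HETATM 7887  C2 BG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
"""

print(sorted(altloc_codes(block)))   # ['B']
print(has_skipped_altlocs(block))    # True
```

If has_skipped_altlocs() returns True, passing flavor=1 as shown above is one option. For a file that mixes several alternate locations it may be safer to select one of them first, since disabling the filter would presumably keep them all.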
>
> Cheers,
> p.
>
> On Wed, Dec 9, 2020 at 6:07 PM Susan Leung <susanhle...@gmail.com> wrote:
>
>> Hi all!
>>
>> I'm having problems reading in a PDB file with altloc B. The code below
>> tries to read in residue 702, chain C in 4kio.pdb (
>> https://files.rcsb.org/view/4KIO.pdb). A mol object is returned but it
>> hasn't got anything in it (or at least it returns an empty string when I
>> call Chem.MolToSmiles() on it).
>>
>> In [1]: import rdkit
>>
>> In [2]: from rdkit import Chem
>>
>> In [3]: rdkit.__version__
>>
>> Out[3]: '2019.03.2'
>>
>> In [4]: m_altlocB = Chem.MolFromPDBBlock("""
>>    ...: HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
>>    ...: HETATM 7887  C2 BG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
>>    ...: HETATM 7888  C3 BG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
>>    ...: HETATM 7889  O4 BG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
>>    ...: HETATM 7890  N5 BG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
>>    ...: HETATM 7891  C6 BG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
>>    ...: HETATM 7892  C7 BG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
>>    ...: HETATM 7893  C8 BG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
>>    ...: HETATM 7894  C9 BG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
>>    ...: HETATM 7895  N10BG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
>>    ...: HETATM 7896  C11BG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
>>    ...: HETATM 7897  N12BG5K C 702       0.291  14.135   6.316  0.51 32.89           N
>>    ...: HETATM 7898  C13BG5K C 702       0.365  15.236   5.562  0.51 37.92           C
>>    ...: HETATM 7899  C14BG5K C 702       1.667  15.577   4.898  0.51 45.61           C
>>    ...: HETATM 7900  N15BG5K C 702       2.013  14.870   3.674  0.51 54.36           N
>>    ...: HETATM 7901  C16BG5K C 702       2.736  13.630   3.932  0.51 61.64           C
>>    ...: HETATM 7902  C17BG5K C 702       3.129  13.030   2.594  0.51 67.85           C
>>    ...: HETATM 7903  O18BG5K C 702       1.993  12.804   1.757  0.51 70.80           O
>>    ...: HETATM 7904  C19BG5K C 702       1.368  14.060   1.475  0.51 61.74           C
>>    ...: HETATM 7905  C20BG5K C 702       0.884  14.647   2.784  0.51 59.26           C
>>    ...: HETATM 7906  C21BG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
>>    ...: HETATM 7907  C22BG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
>>    ...: HETATM 7908  N23BG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
>>    ...: HETATM 7909  C24BG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
>>    ...: HETATM 7910  N25BG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
>>    ...: HETATM 7911  C26BG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
>>    ...: HETATM 7912  C27BG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
>>    ...: HETATM 7913  C28BG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
>>    ...: HETATM 7914  C29BG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
>>    ...: HETATM 7915  N30BG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
>>    ...: HETATM 7916  C31BG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
>>    ...: HETATM 7917  S32BG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
>>    ...: HETATM 7918  N33BG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
>>    ...: """)
>>
>> In [5]: Chem.MolToSmiles(m_altlocB)
>>
>> Out[5]: ''
>>
>> But if I change the altloc column, which is column 17, from B to A, then it
>> reads in and prints the SMILES fine.
>>
>> In [6]: m_altlocA = Chem.MolFromPDBBlock("""
>>    ...: HETATM 7886  C1 AG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
>>    ...: HETATM 7887  C2 AG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
>>    ...: HETATM 7888  C3 AG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
>>    ...: HETATM 7889  O4 AG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
>>    ...: HETATM 7890  N5 AG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
>>    ...: HETATM 7891  C6 AG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
>>    ...: HETATM 7892  C7 AG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
>>    ...: HETATM 7893  C8 AG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
>>    ...: HETATM 7894  C9 AG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
>>    ...: HETATM 7895  N10AG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
>>    ...: HETATM 7896  C11AG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
>>    ...: HETATM 7897  N12AG5K C 702       0.291  14.135   6.316  0.51 32.89           N
>>    ...: HETATM 7898  C13AG5K C 702       0.365  15.236   5.562  0.51 37.92           C
>>    ...: HETATM 7899  C14AG5K C 702       1.667  15.577   4.898  0.51 45.61           C
>>    ...: HETATM 7900  N15AG5K C 702       2.013  14.870   3.674  0.51 54.36           N
>>    ...: HETATM 7901  C16AG5K C 702       2.736  13.630   3.932  0.51 61.64           C
>>    ...: HETATM 7902  C17AG5K C 702       3.129  13.030   2.594  0.51 67.85           C
>>    ...: HETATM 7903  O18AG5K C 702       1.993  12.804   1.757  0.51 70.80           O
>>    ...: HETATM 7904  C19AG5K C 702       1.368  14.060   1.475  0.51 61.74           C
>>    ...: HETATM 7905  C20AG5K C 702       0.884  14.647   2.784  0.51 59.26           C
>>    ...: HETATM 7906  C21AG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
>>    ...: HETATM 7907  C22AG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
>>    ...: HETATM 7908  N23AG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
>>    ...: HETATM 7909  C24AG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
>>    ...: HETATM 7910  N25AG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
>>    ...: HETATM 7911  C26AG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
>>    ...: HETATM 7912  C27AG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
>>    ...: HETATM 7913  C28AG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
>>    ...: HETATM 7914  C29AG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
>>    ...: HETATM 7915  N30AG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
>>    ...: HETATM 7916  C31AG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
>>    ...: HETATM 7917  S32AG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
>>    ...: HETATM 7918  N33AG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
>>    ...: """)
>>
>> In [7]: Chem.MolToSmiles(m_altlocA)
>>
>> Out[7]: 'CC(O)N1CC[C@H](NC2NC(CN3CCOCC3)CC(NC3NC4CCCNC4S3)N2)C1'
>>
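The manual edit described above (changing column 17 from B to A) can also be scripted. The following is a minimal sketch in plain Python; relabel_altloc is a hypothetical helper, not an RDKit function, and it assumes fixed-width PDB records.

```python
def relabel_altloc(pdb_block, old, new):
    """Replace altLoc `old` with `new` in column 17 of ATOM/HETATM records."""
    if len(old) != 1 or len(new) != 1:
        raise ValueError("altLoc codes must be single characters")
    out = []
    for line in pdb_block.splitlines():
        is_atom = line.startswith(("ATOM", "HETATM"))
        # Column 17 in the PDB format is index 16 in a Python string.
        if is_atom and len(line) >= 17 and line[16] == old:
            line = line[:16] + new + line[17:]
        out.append(line)
    return "\n".join(out)


# One record from the example above, with altLoc 'B'.
record = (
    "HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174"
    "  0.51 48.52           C"
)

relabelled = relabel_altloc(record, "B", "A")
print(relabelled[12:20])  # atom name, altLoc and residue name columns
```

The relabelled block can then be passed to Chem.MolFromPDBBlock() as in In [6].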
>> Am I doing something wrong? I also attach this code as a .ipynb.
>>
>> Many thanks in advance!
>>
>> Susan
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
