Hi all! I'm having problems reading in a PDB file with altloc B. The code below tries to read in residue 702, chain C of 4kio.pdb (https://files.rcsb.org/view/4KIO.pdb). A mol object is returned, but it appears to be empty (or at least I get an empty string when I call Chem.MolToSmiles() on it).
In [1]: import rdkit

In [2]: from rdkit import Chem

In [3]: rdkit.__version__
Out[3]: '2019.03.2'

In [4]: m_altlocB = Chem.MolFromPDBBlock("""
   ...: HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
   ...: HETATM 7887  C2 BG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
   ...: HETATM 7888  C3 BG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
   ...: HETATM 7889  O4 BG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
   ...: HETATM 7890  N5 BG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
   ...: HETATM 7891  C6 BG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
   ...: HETATM 7892  C7 BG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
   ...: HETATM 7893  C8 BG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
   ...: HETATM 7894  C9 BG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
   ...: HETATM 7895  N10BG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
   ...: HETATM 7896  C11BG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
   ...: HETATM 7897  N12BG5K C 702       0.291  14.135   6.316  0.51 32.89           N
   ...: HETATM 7898  C13BG5K C 702       0.365  15.236   5.562  0.51 37.92           C
   ...: HETATM 7899  C14BG5K C 702       1.667  15.577   4.898  0.51 45.61           C
   ...: HETATM 7900  N15BG5K C 702       2.013  14.870   3.674  0.51 54.36           N
   ...: HETATM 7901  C16BG5K C 702       2.736  13.630   3.932  0.51 61.64           C
   ...: HETATM 7902  C17BG5K C 702       3.129  13.030   2.594  0.51 67.85           C
   ...: HETATM 7903  O18BG5K C 702       1.993  12.804   1.757  0.51 70.80           O
   ...: HETATM 7904  C19BG5K C 702       1.368  14.060   1.475  0.51 61.74           C
   ...: HETATM 7905  C20BG5K C 702       0.884  14.647   2.784  0.51 59.26           C
   ...: HETATM 7906  C21BG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
   ...: HETATM 7907  C22BG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
   ...: HETATM 7908  N23BG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
   ...: HETATM 7909  C24BG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
   ...: HETATM 7910  N25BG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
   ...: HETATM 7911  C26BG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
   ...: HETATM 7912  C27BG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
   ...: HETATM 7913  C28BG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
   ...: HETATM 7914  C29BG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
   ...: HETATM 7915  N30BG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
   ...: HETATM 7916  C31BG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
   ...: HETATM 7917  S32BG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
   ...: HETATM 7918  N33BG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
   ...: """)

In [5]: Chem.MolToSmiles(m_altlocB)
Out[5]: ''

But if I change the altloc column, which is column 17, from B to A, then it reads in and prints the SMILES fine.

In [6]: m_altlocA = Chem.MolFromPDBBlock("""
   ...: HETATM 7886  C1 AG5K C 702      -0.945  12.634  14.174  0.51 48.52           C
   ...: HETATM 7887  C2 AG5K C 702      -0.880  12.457  12.854  0.51 49.59           C
   ...: HETATM 7888  C3 AG5K C 702      -2.175  12.105  12.250  0.51 50.84           C
   ...: HETATM 7889  O4 AG5K C 702      -3.162  12.016  12.973  0.51 56.22           O
   ...: HETATM 7890  N5 AG5K C 702      -2.196  11.880  10.884  0.51 46.36           N
   ...: HETATM 7891  C6 AG5K C 702      -3.381  11.554  10.123  0.51 42.61           C
   ...: HETATM 7892  C7 AG5K C 702      -2.994  11.820   8.683  0.51 34.05           C
   ...: HETATM 7893  C8 AG5K C 702      -1.767  12.664   8.829  0.51 37.06           C
   ...: HETATM 7894  C9 AG5K C 702      -1.088  12.006   9.994  0.51 39.00           C
   ...: HETATM 7895  N10AG5K C 702      -0.970  12.778   7.632  0.51 32.64           N
   ...: HETATM 7896  C11AG5K C 702      -0.892  13.901   6.897  0.51 31.98           C
   ...: HETATM 7897  N12AG5K C 702       0.291  14.135   6.316  0.51 32.89           N
   ...: HETATM 7898  C13AG5K C 702       0.365  15.236   5.562  0.51 37.92           C
   ...: HETATM 7899  C14AG5K C 702       1.667  15.577   4.898  0.51 45.61           C
   ...: HETATM 7900  N15AG5K C 702       2.013  14.870   3.674  0.51 54.36           N
   ...: HETATM 7901  C16AG5K C 702       2.736  13.630   3.932  0.51 61.64           C
   ...: HETATM 7902  C17AG5K C 702       3.129  13.030   2.594  0.51 67.85           C
   ...: HETATM 7903  O18AG5K C 702       1.993  12.804   1.757  0.51 70.80           O
   ...: HETATM 7904  C19AG5K C 702       1.368  14.060   1.475  0.51 61.74           C
   ...: HETATM 7905  C20AG5K C 702       0.884  14.647   2.784  0.51 59.26           C
   ...: HETATM 7906  C21AG5K C 702      -0.704  16.091   5.410  0.51 30.62           C
   ...: HETATM 7907  C22AG5K C 702      -1.867  15.765   6.060  0.51 27.18           C
   ...: HETATM 7908  N23AG5K C 702      -3.002  16.585   5.957  0.51 24.96           N
   ...: HETATM 7909  C24AG5K C 702      -4.148  16.590   6.655  0.51 25.09           C
   ...: HETATM 7910  N25AG5K C 702      -5.074  17.504   6.513  0.51 23.42           N
   ...: HETATM 7911  C26AG5K C 702      -6.122  17.226   7.351  0.51 23.35           C
   ...: HETATM 7912  C27AG5K C 702      -7.284  17.952   7.501  0.51 24.35           C
   ...: HETATM 7913  C28AG5K C 702      -8.201  17.482   8.421  0.51 25.73           C
   ...: HETATM 7914  C29AG5K C 702      -7.973  16.341   9.153  0.51 25.63           C
   ...: HETATM 7915  N30AG5K C 702      -6.855  15.637   9.017  0.51 29.00           N
   ...: HETATM 7916  C31AG5K C 702      -5.966  16.100   8.129  0.51 25.96           C
   ...: HETATM 7917  S32AG5K C 702      -4.462  15.337   7.792  0.51 27.85           S
   ...: HETATM 7918  N33AG5K C 702      -2.005  14.665   6.815  0.51 30.08           N
   ...: """)

In [7]: Chem.MolToSmiles(m_altlocA)
Out[7]: 'CC(O)N1CC[C@H](NC2NC(CN3CCOCC3)CC(NC3NC4CCCNC4S3)N2)C1'

Am I doing something wrong? I also attach this code as a .ipynb.

Many thanks in advance!
Susan
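The manual edit described above (switching column 17 from B to A) could also be scripted. Below is a minimal sketch in plain Python, not an RDKit feature: `select_altloc` is a hypothetical helper name, and it assumes the input keeps the fixed PDB column layout, with the altloc flag at column 17 (string index 16). The two sample records are atom 7886 from the session above, once with each altloc label.

```python
def select_altloc(pdb_block, keep="B", relabel="A"):
    """Keep ATOM/HETATM records whose altloc is blank or `keep`.

    Records carrying `keep` get their altloc (column 17) rewritten to
    `relabel`; records with any other altloc are dropped. Hypothetical
    helper, assuming fixed-width PDB columns.
    """
    kept = []
    for line in pdb_block.splitlines():
        is_atom = line.startswith(("ATOM  ", "HETATM"))
        if is_atom and len(line) >= 17:
            altloc = line[16]  # column 17, zero-based index 16
            if altloc not in (" ", keep):
                continue  # a different alternate location: skip it
            if altloc == keep:
                line = line[:16] + relabel + line[17:]
        kept.append(line)  # non-atom records pass through unchanged
    return "\n".join(kept)


line_a = "HETATM 7886  C1 AG5K C 702      -0.945  12.634  14.174  0.51 48.52           C"
line_b = "HETATM 7886  C1 BG5K C 702      -0.945  12.634  14.174  0.51 48.52           C"

# Keep only the B conformer and relabel it as A.
fixed = select_altloc(line_a + "\n" + line_b, keep="B", relabel="A")
print(fixed)
```

The returned block could then be passed to Chem.MolFromPDBBlock as in In [6]. Whether relabelling to A or to a blank is the better choice is left open here.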
rdkit_issue_with_altloc_B.ipynb
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss