On 13/01/14 17:54, JP wrote:
> RDKitters!
>
> Finally back on the mailing list!
>
> I am sure we've been through this at the UGM (my mind must have
> wandered off!), but a quick question about the PDB reader and bond
> perception. Is this supported with the current PDB reader? I
> remember that someone (PaulE, perhaps?) was saying bond perception was
> painful, but that there was some dictionary for PDB ligands which helps
> (any idea what this dictionary is called?).
>
> To the technical details.
>
> I am reading in the following PDB file with a simple MolFromPDBFile()
> call:
>
> HETATM    1  O1P 84T A1862     -27.016   9.387 -72.564  1.00 20.81           O
> HETATM    2  P   84T A1862     -27.282   9.818 -73.968  1.00 19.65           P
> HETATM    3  O2P 84T A1862     -27.881  11.176 -74.182  1.00 21.49           O
> HETATM    4  N   84T A1862     -25.869   9.583 -74.813  1.00 19.78           N
> HETATM    5  C   84T A1862     -25.759  10.010 -76.075  1.00 19.97           C
> HETATM    6  CA  84T A1862     -24.493   9.748 -76.807  1.00 19.75           C
> HETATM    7  CB  84T A1862     -24.794   8.678 -77.847  1.00 19.73           C
> HETATM    8  CG  84T A1862     -23.571   8.324 -78.681  1.00 19.70           C
> HETATM    9  CD2 84T A1862     -23.309   9.519 -79.611  1.00 18.49           C
> HETATM   10  CD1 84T A1862     -23.863   6.932 -79.305  1.00 18.60           C
> HETATM   11  OHB 84T A1862     -25.210   7.467 -77.223  1.00 19.17           O
> HETATM   12  OH  84T A1862     -23.549   9.127 -75.984  1.00 20.33           O
> HETATM   13  O   84T A1862     -26.672  10.517 -76.692  1.00 20.26           O
> HETATM   14  O5' 84T A1862     -28.377   8.861 -74.619  1.00 19.39           O
> HETATM   15  C5' 84T A1862     -28.002   7.536 -74.954  1.00 18.47           C
> HETATM   16  C4' 84T A1862     -28.909   7.000 -76.012  1.00 18.24           C
> HETATM   17  C3' 84T A1862     -28.901   7.826 -77.298  1.00 18.28           C
> HETATM   18  C2' 84T A1862     -30.318   7.610 -77.768  1.00 18.69           C
> HETATM   19  O2' 84T A1862     -30.789   8.641 -78.581  1.00 19.64           O
> HETATM   20  O4' 84T A1862     -30.262   6.951 -75.529  1.00 18.80           O
> HETATM   21  C1' 84T A1862     -31.152   7.470 -76.521  1.00 19.01           C
> HETATM   22  N9  84T A1862     -31.753   8.732 -76.009  1.00 20.08           N
> HETATM   23  C4  84T A1862     -33.033   9.013 -76.158  1.00 21.10           C
> HETATM   24  N3  84T A1862     -34.018   8.339 -76.786  1.00 21.58           N
> HETATM   25  C2  84T A1862     -35.263   8.846 -76.830  1.00 21.95           C
> HETATM   26  C8  84T A1862     -31.223   9.701 -75.291  1.00 20.27           C
> HETATM   27  N7  84T A1862     -32.173  10.618 -75.019  1.00 21.28           N
> HETATM   28  C5  84T A1862     -33.315  10.213 -75.563  1.00 21.81           C
> HETATM   29  C6  84T A1862     -34.624  10.702 -75.627  1.00 22.85           C
> HETATM   30  N1  84T A1862     -35.550  10.010 -76.285  1.00 22.44           N
> HETATM   31  N6  84T A1862     -35.008  11.862 -75.052  1.00 23.86           N
> TER
> END
>
> But I am losing all the double bond (and aromatic) information:
>
> import sys
> from rdkit import Chem
> m = Chem.MolFromPDBFile(sys.argv[1])
> print(Chem.MolToSmiles(m))
>
> Gives me:
>
> CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1
>
> As usual, many thanks for your time,

84T is a reference to a chemical description:

http://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/84T

This (mmCIF) is what I parse, either from the local dictionary or by
downloading the file on the fly:

ftp://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/files/mmcif/84T.cif

Does that help?

Paul.
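
For what it's worth, here is a rough sketch of what "parsing the mmCIF"
for bond orders can look like, using only the Python standard library.
It assumes the file has a `_chem_comp_bond` loop with `atom_id_1`,
`atom_id_2`, `value_order` and `pdbx_aromatic_flag` items, as in the
chemical component dictionary; check the actual 84T.cif before relying
on that. The sample text is a made-up fragment for illustration only
(not copied from the real file), and `read_bond_orders` and `tokenize`
are hypothetical helper names, not RDKit functions.

```python
import re

# Made-up fragment in the style of a component dictionary entry.
# NOT the real 84T.cif: the rows only illustrate the layout.
SAMPLE_CIF = """\
data_84T
#
loop_
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.value_order
_chem_comp_bond.pdbx_aromatic_flag
84T C   O   DOUB N
84T C   CA  SING N
84T N9  C8  SING Y
84T "C5'" "C4'" SING N
#
"""

# Assumed vocabulary for value_order; anything else maps to None.
ORDER = {"SING": 1, "DOUB": 2, "TRIP": 3}

# A token is either a quoted string or a run of non-blank characters.
_TOKEN = re.compile(r'"([^"]*)"|\'([^\']*)\'|(\S+)')


def tokenize(line):
    """Split one mmCIF data row, removing quotes around names like "C5'"."""
    return [next(g for g in m.groups() if g is not None)
            for m in _TOKEN.finditer(line)]


def read_bond_orders(cif_text):
    """Return {(atom1, atom2): (order, is_aromatic)} from _chem_comp_bond loops.

    Simplified on purpose: only the looped form is handled, and
    multi-line values are not supported.
    """
    columns = []  # item names of the loop currently being read
    bonds = {}
    for raw in cif_text.splitlines():
        line = raw.strip()
        if (not line or line.startswith("#") or line == "loop_"
                or line.startswith("data_")):
            columns = []  # any of these ends the current loop
            continue
        if line.startswith("_"):
            columns.append(line.split()[0])
            continue
        if columns and all(c.startswith("_chem_comp_bond.") for c in columns):
            row = dict(zip(columns, tokenize(line)))
            key = (row["_chem_comp_bond.atom_id_1"],
                   row["_chem_comp_bond.atom_id_2"])
            order = ORDER.get(row["_chem_comp_bond.value_order"].upper())
            aromatic = row.get("_chem_comp_bond.pdbx_aromatic_flag") == "Y"
            bonds[key] = (order, aromatic)
    return bonds


if __name__ == "__main__":
    for (a1, a2), (order, aromatic) in read_bond_orders(SAMPLE_CIF).items():
        print(a1, a2, order, "aromatic" if aromatic else "")
```

The keys are PDB atom names, so the table can be matched against the
atom-name column of the HETATM records to decide which bonds to promote
to double or aromatic after the connectivity has been perceived.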