Hello! I am working with a PDB block, which is attached to this email. The ligand contains aromatic rings; however, when the block is read into RDKit and converted into a SMILES string, the aromatic rings come out as aliphatic rings. Any thoughts?
Here is the Python code:

    from rdkit import Chem

    def extract_data(filename):
        # Collect every line that mentions HETATM (the ligand atoms)
        extracted_info = ""
        with open(filename) as f:
            for line in f.readlines():
                if "HETATM" in line:
                    extracted_info += line
        return extracted_info

    # solution_pdb_filenames is a list of PDB file paths, defined elsewhere
    for index, filename in enumerate(solution_pdb_filenames):
        row = extract_data(filename)
        m = Chem.MolFromPDBBlock(row, sanitize=True, removeHs=False)
        Chem.SetHybridization(m)
        Chem.SetAromaticity(m)
        # Not needed since sanitizing during read-in, but trying to figure out if it actually worked
        Chem.SanitizeMol(m, sanitizeOps=Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)
        print("Parsing file " + str(index) + " of " + str(len(solution_pdb_filenames)))
        print(Chem.MolToSmiles(m, kekuleSmiles=True, allHsExplicit=True))

The output SMILES string is:

    [H][O][CH]1[NH][CH]([C]([H])([H])[CH]([OH])[OH])[CH]([C]([H])([H])[C]([H])([H])[H])[CH]([CH]([OH])[CH]2[CH]([H])[CH]([H])[CH]([H])[CH]([N]([H])[H])[CH]2[H])[CH]1[N]([C]([H])([H])[H])[C]([H])([H])[H]

Steven Combs
test.pdb
Description: Protein Databank data
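One thing that may be worth checking, offered as a guess and not a confirmed diagnosis: the filter in extract_data keeps only lines containing "HETATM", so any CONECT records in the file are dropped before RDKit sees the block. With coordinates alone, bonds for a non-standard residue are typically inferred from distances and come out as single bonds, which would be consistent with the SMILES above. The sketch below is a variant of the extraction step that matches on the record name in columns 1-6 and keeps CONECT lines as well. The function name extract_records and the sample fragment are hypothetical. Whether test.pdb contains CONECT records at all, and whether a given RDKit version reads repeated CONECT entries as higher bond orders, are assumptions that have not been verified here.

```python
def extract_records(pdb_text, record_names=("HETATM", "CONECT")):
    """Return the lines of pdb_text whose record name is in record_names."""
    kept = []
    for line in pdb_text.splitlines():
        # The record name occupies columns 1-6 of a PDB line.
        if line[:6].strip() in record_names:
            kept.append(line)
    return ("\n".join(kept) + "\n") if kept else ""


# Tiny made-up fragment (not taken from test.pdb) to show what is kept.
sample = (
    "REMARK   1 HETATM mentioned in a remark\n"
    "HETATM    1  C1  LIG A   1       0.000   0.000   0.000  1.00  0.00           C\n"
    "HETATM    2  C2  LIG A   1       1.400   0.000   0.000  1.00  0.00           C\n"
    "CONECT    1    2    2\n"
    "END\n"
)

block = extract_records(sample)
# True if at least one CONECT record survived the filtering.
has_conect = any(line.startswith("CONECT") for line in block.splitlines())
print(has_conect)
```

The filtered block could then be passed to Chem.MolFromPDBBlock exactly as in the loop above. If the file has no CONECT records, or they carry no bond-order information, the bond orders would have to come from another source, such as a template structure of the ligand.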