Dear Mike,

Try changing all metal-ligand bonds to "dative" or "ionic", and standardize afterwards (but disable the adjusting of implicit Hs). This way, I was able to process (in KNIME) >99% of the organometallics (incl. metallocenes) downloaded from Reaxys. Example snippet (which doesn't check the "directionality" of the bond, though):

    from rdkit import Chem
    import pandas as pd

    metals = ['Ti', 'Al', 'Mo', 'Ru', 'Co', 'Rh', 'Ir', 'Ni', 'Zr', 'Hf', 'W']
    outmols = []
    # input_table is the DataFrame provided by the KNIME Python node
    mols = input_table['Molecule']
    for mol in mols:
        for bond in mol.GetBonds():
            # a bond counts as metal-ligand if either end is in the metal list
            if bond.GetEndAtom().GetSymbol() in metals or bond.GetBeginAtom().GetSymbol() in metals:
                print("found metal-ligand bond")
                print("original type: " + str(bond.GetBondType()))
                btype = Chem.rdchem.BondType.DATIVE
                bond.SetBondType(btype)
                print("changed to: " + str(mol.GetBonds()[bond.GetIdx()].GetBondType()))
        try:
            # run every sanitization step except the adjustment of implicit Hs
            Chem.SanitizeMol(mol, sanitizeOps=Chem.SanitizeFlags.SANITIZE_ALL ^ Chem.SanitizeFlags.SANITIZE_ADJUSTHS)
        except ValueError as ve:
            print("Sanitization failed")
            print(ve)
    output_table = input_table.copy()

Best,
Michal

On Mon, 7 Oct 2019 at 13:45, Greg Landrum <greg.land...@gmail.com> wrote:
>
> Hi Mike,
>
> I think you mean "organometallics", not "metallocenes" (the two molecules in
> that SDF are coordination complexes, but neither is a metallocene; I
> stopped looking after that). The compounds are also drawn in such a way that
> they are chemically unreasonable. This is pretty typical for organometallics
> in V2000 mol files.
>
> Unless you have a reliable source of input molecules and/or are willing to
> look at every one, I would just filter anything that has a metal-nonmetal
> bond out of the dataset.
>
> If you really want to do something with the molecules:
> The rdMolStandardize code, which is derived from MolVS, currently has one
> approach for dealing with this type of complex: breaking all the covalent
> bonds to the metal (this is also what InChI does).
> Given what a mess these compounds are when they show up in most standard
> file formats, this seems like a reasonable thing to do:
>
> In [4]: from rdkit import Chem
>
> In [5]: from rdkit.Chem.MolStandardize import rdMolStandardize
>
> In [6]: dcon = rdMolStandardize.MetalDisconnector()
> [14:34:03] Initializing MetalDisconnector
>
> In [8]: suppl = Chem.SDMolSupplier('/home/glandrum/Downloads/RDKit_input.sdf',sanitize=False,removeHs=False)
>
> In [9]: m = suppl[0]
>
> In [10]: om = dcon.Disconnect(m)
> [14:34:29] Running MetalDisconnector
> [14:34:29] Removed covalent bond between Tc and O
> [14:34:29] Removed covalent bond between Tc and O
> [14:34:29] Removed covalent bond between Tc and S
> [14:34:29] Removed covalent bond between Tc and S
> [14:34:29] Removed covalent bond between Tc and P
> [14:34:29] Removed covalent bond between Tc and P
>
> In [11]: Chem.SanitizeMol(om)
> Out[11]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [12]: Chem.MolToSmiles(om)
> Out[12]: 'CSCC[C@@H](NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](Cc1cnc[nH]1)NC(=O)CNC(=O)[C@H](NC(=O)[C@@H](C)NC(=O)[C@H](CC(=O)[C@@H](CCC(N)=O)NC(=O)CCCCNC(=O)CCCCC(CC[SH-]CCC[PH-](CO)CO)[SH-]CCC[PH-](CO)CO)c1cc2ccccc2[nH]1)C(C)C)C(N)=O.[99Tc+9].[Cl-].[O-2].[O-2]'
>
> It's worth noting that this molecule is still a long way from making chemical
> sense: the +9 charge on the Tc and the [SH-] and [PH-] groups are not
> sensible. So there's more manual fixing required here.
>
> Best,
> -greg
>
> On Mon, Oct 7, 2019 at 12:06 PM Mike Mazanetz <mi...@novadatasolutions.co.uk> wrote:
>>
>> Hello RDKit experts!
>>
>> Is there a function to handle metallocenes in the standardizer?
>>
>> I’ve enclosed some examples of compounds.
>>
>> Thanks,
>>
>> mike
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
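
P.S. The snippet at the top of this message sets the bond type in place, so the dative bond keeps whatever begin/end order the input file happened to use. In RDKit a dative bond points from its begin atom (the donor) to its end atom (the acceptor), so a rough way to handle the "directionality" is to remove each metal-ligand bond and re-add it with the ligand atom first. What follows is only a sketch under assumptions: the helper names `metal_ligand_pairs` and `to_dative` are made up for illustration, the metal list is the same short one as above, every metal-ligand bond is treated as ligand->metal (which will not be right for every complex), and removing and re-adding a bond discards any stereo information stored on it.

```python
from rdkit import Chem

# Same short metal list as in the snippet above; extend it for your own data.
METALS = {'Ti', 'Al', 'Mo', 'Ru', 'Co', 'Rh', 'Ir', 'Ni', 'Zr', 'Hf', 'W'}


def metal_ligand_pairs(mol, metals=METALS):
    """Return (ligand_idx, metal_idx) for each bond with exactly one metal end."""
    pairs = []
    for bond in mol.GetBonds():
        begin, end = bond.GetBeginAtom(), bond.GetEndAtom()
        begin_is_metal = begin.GetSymbol() in metals
        end_is_metal = end.GetSymbol() in metals
        if begin_is_metal == end_is_metal:
            continue  # no metal involved, or a metal-metal bond: leave it alone
        if end_is_metal:
            pairs.append((begin.GetIdx(), end.GetIdx()))
        else:
            pairs.append((end.GetIdx(), begin.GetIdx()))
    return pairs


def to_dative(mol, metals=METALS):
    """Return a copy of mol with every metal-ligand bond as ligand->metal dative."""
    rw = Chem.RWMol(mol)
    # The pair list is built completely before any bond is modified.
    for ligand_idx, metal_idx in metal_ligand_pairs(rw, metals):
        rw.RemoveBond(ligand_idx, metal_idx)
        # Begin atom = ligand (donor), end atom = metal (acceptor).
        rw.AddBond(ligand_idx, metal_idx, Chem.BondType.DATIVE)
    return rw.GetMol()
```

The result can then be sanitized with the same flags as in the snippet above (`SANITIZE_ALL ^ SANITIZE_ADJUSTHS`). The first helper also covers Greg's simpler suggestion: a molecule for which `metal_ligand_pairs` returns a non-empty list has a metal-nonmetal bond and could just be dropped from the dataset.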