Dear Mike,
Try changing all metal-ligand bonds to "dative" or "ionic", and
standardize afterwards (but disable the adjustment of implicit Hs).
This way, I was able to process (in KNIME) >99% of the organometallics
(incl. metallocenes) downloaded from Reaxys.
Example snippet (which doesn't check the "directionality" of the bond, though):

from rdkit import Chem
import pandas as pd

# Metals whose bonds get converted; extend the list as needed.
metals = ['Ti', 'Al', 'Mo', 'Ru', 'Co', 'Rh', 'Ir', 'Ni', 'Zr', 'Hf', 'W']
# Run every sanitization step except the adjustment of Hs.
sanitize_ops = Chem.SanitizeFlags.SANITIZE_ALL ^ Chem.SanitizeFlags.SANITIZE_ADJUSTHS

outmols = []
# input_table is the table handed to the script by KNIME.
mols = input_table['Molecule']
for mol in mols:
    for bond in mol.GetBonds():
        if (bond.GetEndAtom().GetSymbol() in metals
                or bond.GetBeginAtom().GetSymbol() in metals):
            print("found metal-ligand bond")
            print("original type: " + str(bond.GetBondType()))
            btype = Chem.rdchem.BondType.DATIVE
            bond.SetBondType(btype)
            print("changed to: "
                  + str(mol.GetBonds()[bond.GetIdx()].GetBondType()))
            try:
                Chem.SanitizeMol(mol, sanitizeOps=sanitize_ops)
            except ValueError as ve:
                print("Sanitization failed")
                print(ve)
output_table = input_table.copy()
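
If the direction matters, a minimal sketch along the following lines might help. It assumes that the ligand atom should be the begin atom and the metal the end atom of each dative bond, and it reuses the illustrative metal list from the snippet above. The function name set_directed_dative_bonds is made up for this example, and I have not run this on the Reaxys set.

```python
from rdkit import Chem

# Assumption: same illustrative metal list as in the snippet above.
METALS = frozenset(['Ti', 'Al', 'Mo', 'Ru', 'Co', 'Rh', 'Ir', 'Ni', 'Zr', 'Hf', 'W'])


def set_directed_dative_bonds(mol, metals=METALS):
    """Return a copy of mol in which every metal-ligand bond is dative,
    with the ligand atom as begin atom and the metal as end atom."""
    rwmol = Chem.RWMol(mol)
    pairs = []
    # Collect the affected bonds first, so nothing is edited while iterating.
    for bond in rwmol.GetBonds():
        begin, end = bond.GetBeginAtom(), bond.GetEndAtom()
        begin_is_metal = begin.GetSymbol() in metals
        end_is_metal = end.GetSymbol() in metals
        if begin_is_metal == end_is_metal:
            continue  # no metal involved, or a metal-metal bond: leave as is
        donor, metal = (end, begin) if begin_is_metal else (begin, end)
        pairs.append((donor.GetIdx(), metal.GetIdx()))
    # Re-create each bond in the ligand -> metal direction.
    for donor_idx, metal_idx in pairs:
        rwmol.RemoveBond(donor_idx, metal_idx)
        rwmol.AddBond(donor_idx, metal_idx, Chem.rdchem.BondType.DATIVE)
    return rwmol.GetMol()
```

Note that removing and re-adding a bond discards whatever was stored on the old bond (e.g. stereo annotations), and the returned molecule still has to be sanitized as above.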

Best,
Michal



On Mon, 7 Oct 2019 at 13:45, Greg Landrum <greg.land...@gmail.com> wrote:
>
> Hi Mike,
>
> I think you mean "organometallics", not "metallocenes" (the two molecules in 
> that SDF are coordination complexes, but neither is a metallocene; I 
> stopped looking after that). The compounds are also drawn in such a way that 
> they are chemically unreasonable. This is pretty typical for organometallics 
> in V2000 mol files.
>
> Unless you have a reliable source of input molecules and/or are willing to 
> look at every one, I would just filter anything that has a metal-nonmetal 
> bond out of the dataset.
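
A filter of this kind could be sketched roughly as follows. The metal set is an assumption for illustration only (it is far from complete), and the names has_metal_nonmetal_bond and drop_metal_bonded are made up for this example.

```python
from rdkit import Chem

# Assumption: an illustrative, incomplete set of metal symbols.
METALS = frozenset(['Ti', 'Al', 'Mo', 'Ru', 'Co', 'Rh', 'Ir', 'Ni',
                    'Zr', 'Hf', 'W', 'Tc'])


def has_metal_nonmetal_bond(mol, metals=METALS):
    """True if any bond connects a listed metal to an atom not in the list."""
    for bond in mol.GetBonds():
        begin_is_metal = bond.GetBeginAtom().GetSymbol() in metals
        end_is_metal = bond.GetEndAtom().GetSymbol() in metals
        if begin_is_metal != end_is_metal:
            return True
    return False


def drop_metal_bonded(mols, metals=METALS):
    """Keep only the parsed molecules without a metal-nonmetal bond."""
    return [m for m in mols
            if m is not None and not has_metal_nonmetal_bond(m, metals)]
```

Free metal ions that are not bonded to anything pass this filter, since only bonds are inspected.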
>
> If you really want to do something with the molecules:
> The rdMolStandardize code, which is derived from MolVS, currently has one 
> approach for dealing with this type of complex: breaking all the covalent 
> bonds to the metal (this is also what InChI does). Given what a mess these 
> compounds are when they show up in most standard file formats, this seems 
> like a reasonable thing to do:
>
> In [4]: from rdkit import Chem
>
> In [5]: from rdkit.Chem.MolStandardize import rdMolStandardize
>
> In [6]: dcon = rdMolStandardize.MetalDisconnector()
> [14:34:03] Initializing MetalDisconnector
>
> In [8]: suppl = Chem.SDMolSupplier('/home/glandrum/Downloads/RDKit_input.sdf',sanitize=False,removeHs=False)
>
> In [9]: m = suppl[0]
>
> In [10]: om = dcon.Disconnect(m)
> [14:34:29] Running MetalDisconnector
> [14:34:29] Removed covalent bond between Tc and O
> [14:34:29] Removed covalent bond between Tc and O
> [14:34:29] Removed covalent bond between Tc and S
> [14:34:29] Removed covalent bond between Tc and S
> [14:34:29] Removed covalent bond between Tc and P
> [14:34:29] Removed covalent bond between Tc and P
>
> In [11]: Chem.SanitizeMol(om)
> Out[11]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [12]: Chem.MolToSmiles(om)
> Out[12]: 
> 'CSCC[C@@H](NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](Cc1cnc[nH]1)NC(=O)CNC(=O)[C@H](NC(=O)[C@@H](C)NC(=O)[C@H](CC(=O)[C@@H](CCC(N)=O)NC(=O)CCCCNC(=O)CCCCC(CC[SH-]CCC[PH-](CO)CO)[SH-]CCC[PH-](CO)CO)c1cc2ccccc2[nH]1)C(C)C)C(N)=O.[99Tc+9].[Cl-].[O-2].[O-2]'
>
>
> It's worth noting that this molecule is still a long way from making chemical 
> sense: the +9 charge on the Tc and the [SH-] and [PH-] groups are not 
> sensible. So there's more manual fixing required here.
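
To find the entries that need such manual fixing, one could list the atoms with implausibly large formal charges after disconnecting. This is only a sketch: the threshold of 3 is an arbitrary assumption, and the helper names suspicious_charges and disconnect_and_audit are made up for this example. It would not catch the [SH-] and [PH-] groups, which need a separate check.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def suspicious_charges(mol, max_abs_charge=3):
    """Return (atom index, symbol, formal charge) for every atom whose
    absolute formal charge exceeds max_abs_charge (arbitrary default)."""
    return [(atom.GetIdx(), atom.GetSymbol(), atom.GetFormalCharge())
            for atom in mol.GetAtoms()
            if abs(atom.GetFormalCharge()) > max_abs_charge]


def disconnect_and_audit(mol, max_abs_charge=3):
    """Break the covalent bonds to metals, then report suspicious charges."""
    dcon = rdMolStandardize.MetalDisconnector()
    disconnected = dcon.Disconnect(mol)
    return disconnected, suspicious_charges(disconnected, max_abs_charge)
```

Molecules for which the second return value is non-empty are candidates for a manual look.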
>
>
> Best,
> -greg
>
>
> On Mon, Oct 7, 2019 at 12:06 PM Mike Mazanetz <mi...@novadatasolutions.co.uk> 
> wrote:
>>
>> Hello RDKit experts!
>>
>>
>>
>> Is there a function to handle metallocenes in the standardizer?
>>
>>
>>
>> I’ve enclosed some examples of compounds.
>>
>>
>>
>> Thanks,
>>
>> mike
>>
>>
>>
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>