Hi Mike,

I think you mean "organometallics", not "metallocenes" (the two molecules
in that SDF are coordination complexes, but neither is a metallocene; I
stopped looking after that). The compounds are also drawn in a way that
makes them chemically unreasonable. This is pretty typical for
organometallics in V2000 mol files.

Unless you have a reliable source of input molecules and/or are willing to
look at every one, I would just filter anything that has a metal-nonmetal
bond out of the dataset.
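
A minimal sketch of that filter, in case it helps. The METALS set is my own
assumption about what counts as a metal (metalloids are left out), and the
function names has_metal_nonmetal_bond and keep_metal_free are made up for
this example; they are not part of the RDKit:

```python
from rdkit import Chem

# Assumption: a hand-picked definition of "metal" by atomic number.
# Metalloids (B, Si, Ge, As, Sb, Te) are deliberately excluded; adjust as needed.
METALS = frozenset(
    [3, 4, 11, 12, 13]           # Li, Be, Na, Mg, Al
    + list(range(19, 32))        # K through Ga
    + list(range(37, 51))        # Rb through Sn
    + list(range(55, 85))        # Cs through Po
    + list(range(87, 104))       # Fr through Lr
)


def has_metal_nonmetal_bond(mol):
    """Return True if any bond joins a metal atom to a non-metal atom."""
    for bond in mol.GetBonds():
        begin_is_metal = bond.GetBeginAtom().GetAtomicNum() in METALS
        end_is_metal = bond.GetEndAtom().GetAtomicNum() in METALS
        if begin_is_metal != end_is_metal:
            return True
    return False


def keep_metal_free(mols):
    """Yield molecules without a metal-nonmetal bond; skip unreadable entries."""
    for mol in mols:
        if mol is not None and not has_metal_nonmetal_bond(mol):
            yield mol
```

Only the connection table is inspected, so this should also work on
molecules read with sanitize=False, for example by passing a
Chem.SDMolSupplier to keep_metal_free. Simple salts written as separate
ions have no metal-nonmetal bond and pass through.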

If you really want to do something with the molecules:
The rdMolStandardize code, which is derived from MolVS, currently has one
approach for dealing with this type of complex: breaking all the covalent
bonds to the metal (this is also what InChI does). Given what a mess these
compounds are when they show up in most standard file formats, this seems
like a reasonable thing to do:

In [4]: from rdkit import Chem

In [5]: from rdkit.Chem.MolStandardize import rdMolStandardize

In [6]: dcon = rdMolStandardize.MetalDisconnector()
[14:34:03] Initializing MetalDisconnector

In [8]: suppl = Chem.SDMolSupplier('/home/glandrum/Downloads/RDKit_input.sdf',sanitize=False,removeHs=False)

In [9]: m = suppl[0]

In [10]: om = dcon.Disconnect(m)
[14:34:29] Running MetalDisconnector
[14:34:29] Removed covalent bond between Tc and O
[14:34:29] Removed covalent bond between Tc and O
[14:34:29] Removed covalent bond between Tc and S
[14:34:29] Removed covalent bond between Tc and S
[14:34:29] Removed covalent bond between Tc and P
[14:34:29] Removed covalent bond between Tc and P

In [11]: Chem.SanitizeMol(om)
Out[11]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [12]: Chem.MolToSmiles(om)
Out[12]: 'CSCC[C@@H](NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](Cc1cnc[nH]1)NC(=O)CNC(=O)[C@H](NC(=O)[C@@H](C)NC(=O)[C@H](CC(=O)[C@@H](CCC(N)=O)NC(=O)CCCCNC(=O)CCCCC(CC[SH-]CCC[PH-](CO)CO)[SH-]CCC[PH-](CO)CO)c1cc2ccccc2[nH]1)C(C)C)C(N)=O.[99Tc+9].[Cl-].[O-2].[O-2]'


It's worth noting that this molecule is still a long way from making
chemical sense: the +9 charge on the Tc and the [SH-] and [PH-] groups are
not sensible. So more manual fixing is required here.
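
To find the atoms that need that manual attention, something like the
following could list every atom that still carries a formal charge after
disconnection. This is only a sketch: charged_atoms is a hypothetical
helper name, not an RDKit function, and deciding which charges are
actually wrong is still up to you:

```python
from rdkit import Chem


def charged_atoms(mol):
    """Return (atom index, element symbol, formal charge) for each charged atom."""
    return [
        (atom.GetIdx(), atom.GetSymbol(), atom.GetFormalCharge())
        for atom in mol.GetAtoms()
        if atom.GetFormalCharge() != 0
    ]
```

It only reads atom properties, so it can be called on an unsanitized
molecule such as the result of dcon.Disconnect(m) above.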


Best,
-greg


On Mon, Oct 7, 2019 at 12:06 PM Mike Mazanetz <mi...@novadatasolutions.co.uk>
wrote:

> Hello RDKit experts !
>
>
>
> Is there a function to handle metallocenes in the standardizer?
>
>
>
> I’ve enclosed some examples of compounds.
>
>
>
> Thanks,
>
> mike
>
>
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>