Hi Mike,
I think you mean "organometallics", not "metallocenes" (the two molecules
in that SDF are coordination complexes, but neither is a metallocene; I
stopped looking after that). The compounds are also drawn in such a way
that they are chemically unreasonable. This is pretty typical for
organometallics in V2000 mol files.
Unless you have a reliable source of input molecules and/or are willing to
look at every one, I would just filter anything that has a metal-nonmetal
bond out of the dataset.
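
A minimal sketch of that kind of filter, in case it's useful. The list of
atomic numbers treated as "metals" and the two helper names are assumptions
made for illustration, not something RDKit defines; adjust the list to
whatever your definition of a metal is.

```python
from rdkit import Chem

# Assumption: an illustrative, not authoritative, set of metal atomic numbers.
METAL_ATOMIC_NUMS = (
    {3, 4, 11, 12, 13, 19, 20, 31, 37, 38, 49, 50, 55, 56, 81, 82, 83, 87, 88}
    | set(range(21, 31))   # Sc-Zn
    | set(range(39, 49))   # Y-Cd (includes Tc)
    | set(range(57, 81))   # La-Hg (lanthanides and 5d metals)
    | set(range(89, 113))  # Ac-Cn
)


def has_metal_nonmetal_bond(mol):
    """Return True if any bond joins a metal atom to a non-metal atom."""
    for bond in mol.GetBonds():
        begin_is_metal = bond.GetBeginAtom().GetAtomicNum() in METAL_ATOMIC_NUMS
        end_is_metal = bond.GetEndAtom().GetAtomicNum() in METAL_ATOMIC_NUMS
        if begin_is_metal != end_is_metal:
            return True
    return False


def filter_organometallics(mols):
    """Keep molecules without metal-nonmetal bonds; skip entries that failed to parse (None)."""
    return [m for m in mols if m is not None and not has_metal_nonmetal_bond(m)]
```

Since this only looks at atomic numbers and bonds, it should also work on
molecules read with sanitize=False, which is probably what you want for
files like this one.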
If you really want to do something with the molecules:
The rdMolStandardize code, which is derived from MolVS, currently has one
approach for dealing with this type of complex: breaking all the covalent
bonds to the metal (this is also what InChI does). Given what a mess these
compounds are when they show up in most standard file formats, this seems
like a reasonable thing to do:
In [4]: from rdkit import Chem
In [5]: from rdkit.Chem.MolStandardize import rdMolStandardize
In [6]: dcon = rdMolStandardize.MetalDisconnector()
[14:34:03] Initializing MetalDisconnector
In [8]: suppl = Chem.SDMolSupplier('/home/glandrum/Downloads/RDKit_input.sdf',sanitize=False,removeHs=False)
In [9]: m = suppl[0]
In [10]: om = dcon.Disconnect(m)
[14:34:29] Running MetalDisconnector
[14:34:29] Removed covalent bond between Tc and O
[14:34:29] Removed covalent bond between Tc and O
[14:34:29] Removed covalent bond between Tc and S
[14:34:29] Removed covalent bond between Tc and S
[14:34:29] Removed covalent bond between Tc and P
[14:34:29] Removed covalent bond between Tc and P
In [11]: Chem.SanitizeMol(om)
Out[11]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
In [12]: Chem.MolToSmiles(om)
Out[12]: 'CSCC[C@@H](NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](Cc1cnc[nH]1)NC(=O)CNC(=O)[C@H](NC(=O)[C@@H](C)NC(=O)[C@H](CC(=O)[C@@H](CCC(N)=O)NC(=O)CCCCNC(=O)CCCCC(CC[SH-]CCC[PH-](CO)CO)[SH-]CCC[PH-](CO)CO)c1cc2ccccc2[nH]1)C(C)C)C(N)=O.[99Tc+9].[Cl-].[O-2].[O-2]'
It's worth noting that this molecule is still a long way from making
chemical sense: the +9 charge on the Tc and the [SH-] and [PH-] groups are
not sensible. So there's more manual fixing required here.
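
To help with that manual review, here is a small sketch that lists every
atom carrying a formal charge after the disconnection. The function name is
hypothetical, and this only reports charges; it makes no attempt to decide
which ones are wrong or to fix them.

```python
from rdkit import Chem


def list_charged_atoms(mol):
    """Return (atom index, element symbol, formal charge) for every charged atom."""
    return [
        (atom.GetIdx(), atom.GetSymbol(), atom.GetFormalCharge())
        for atom in mol.GetAtoms()
        if atom.GetFormalCharge() != 0
    ]
```

Calling list_charged_atoms(om) on the disconnected molecule gives you the
atoms to look at by hand.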
Best,
-greg
On Mon, Oct 7, 2019 at 12:06 PM Mike Mazanetz <[email protected]>
wrote:
> Hello RDKit experts!
>
> Is there a function to handle metallocenes in the standardizer?
>
> I’ve enclosed some examples of compounds.
>
> Thanks,
>
> mike
>
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>