On Thu, Dec 14, 2017 at 6:35 AM, Francois BERENGER <
beren...@bioreg.kyushu-u.ac.jp> wrote:

> On 12/14/2017 02:10 PM, Greg Landrum wrote:
> >
> > On Thu, Dec 14, 2017 at 4:22 AM, Francois BERENGER
> > <beren...@bioreg.kyushu-u.ac.jp <mailto:beren...@bioreg.kyushu-u.ac.jp>>
> > wrote:
> >
> >     On 12/14/2017 05:15 AM, Sundar wrote:
> >     > Hi RDkit users,
> >     >
> >     > I encounter following sanitize issue while I was trying to load an
> SD
> >     > file using
> >     > Chem.SDMolSupplier('lig.sdf')
> >     >
> >     > Explicit valence for atom # 16 N, 4, is greater than permitted
> >     > ERROR: Could not sanitize molecule ending on line 3145
> >
> >     I also encounter this exact error sometimes.
> >
> >     Is there a way to tell rdkit to automatically correct this atom type?
> >
> >
> > The code currently only automatically corrects cases where it's really,
> > really obvious what the correction should be, like C-N(=O)=O ->
> > C-[N+](=O)[O-].
>
> Where is this in the code?
> I might have a look one day.
>

It's here:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/MolOps.cpp#L194

>
> > The philosophy taken in the RDKit is that it's better to have a bad
> > structure be rejected than it is to try and learn from it.
> > If you disagree with this, it is pretty easy to switch off the
> > sanitization checks and keep the bad molecules.
>
> I understand. I also guess unsanitized molecules would make some things
> crash, just later.


That depends. You can turn off the strict property checking:

In [*2*]: m = Chem.MolFromSmiles('C1CCN1(C)C')

[08:09:23] Explicit valence for atom # 3 N, 4, is greater than permitted


In [*3*]: m = Chem.MolFromSmiles('C1CCN1(C)C',sanitize=*False*)


In [*6*]: m.UpdatePropertyCache(strict=*False*)


In [*7*]:
Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_PROPERTIES)

Out[*7*]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE


In [*8*]: Chem.MolToSmiles(m)

Out[*8*]: 'CN1(C)CCC1'


or if you want to be more aggressive you can also turn off the cleanup that
"fixes" those odd structures:

In [*9*]: m = Chem.MolFromSmiles('CCCN(=O)=O',sanitize=*False*)


In [*10*]: m.UpdatePropertyCache(strict=*False*)


In [*11*]:
Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_PROPERTIES^Chem.SANITIZE_CLEANUP)

Out[*11*]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE


In [*12*]: Chem.MolToSmiles(m)
Out[*12*]: 'CCCN(=O)=O'

In either case, many standard molecular operations should still work,
you'll just be operating on molecules with atoms in unreasonable valence
states.

-greg
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to