Thanks everyone for your valuable inputs.

Chem.SDMolSupplier('lig.sdf', sanitize=False)
worked well at the moment for my compounds. I am not sure if I will have
problem in calculating descriptors or further in my calculations.
I will also try to turn off the strict property checking.

Thanks,
Sundar

On Thu, Dec 14, 2017 at 1:14 AM, Greg Landrum <greg.land...@gmail.com>
wrote:

>
>
> On Thu, Dec 14, 2017 at 6:35 AM, Francois BERENGER <
> beren...@bioreg.kyushu-u.ac.jp> wrote:
>
>> On 12/14/2017 02:10 PM, Greg Landrum wrote:
>> >
>> > On Thu, Dec 14, 2017 at 4:22 AM, Francois BERENGER
>> > <beren...@bioreg.kyushu-u.ac.jp <mailto:beren...@bioreg.kyushu-u.ac.jp
>> >>
>> > wrote:
>> >
>> >     On 12/14/2017 05:15 AM, Sundar wrote:
>> >     > Hi RDkit users,
>> >     >
>> >     > I encounter following sanitize issue while I was trying to load
>> an SD
>> >     > file using
>> >     > Chem.SDMolSupplier('lig.sdf')
>> >     >
>> >     > Explicit valence for atom # 16 N, 4, is greater than permitted
>> >     > ERROR: Could not sanitize molecule ending on line 3145
>> >
>> >     I also encounter this exact error sometimes.
>> >
>> >     Is there a way to tell rdkit to automatically correct this atom
>> type?
>> >
>> >
>> > The code currently only automatically corrects cases where it's really,
>> > really obvious what the correction should be, like C-N(=O)=O ->
>> > C-[N+](=O)[O-].
>>
>> Where is this in the code?
>> I might have a look one day.
>>
>
> It's here:
> https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/MolOps.cpp#L194
>
> >
>> > The philosophy taken in the RDKit is that it's better to have a bad
>> > structure be rejected than it is to try and learn from it.
>> > If you disagree with this, it is pretty easy to switch off the
>> > sanitization checks and keep the bad molecules.
>>
>> I understand. I also guess unsanitized molecules would make some things
>> crash, just later.
>
>
> That depends. You can turn off the strict property checking:
>
> In [*2*]: m = Chem.MolFromSmiles('C1CCN1(C)C')
>
> [08:09:23] Explicit valence for atom # 3 N, 4, is greater than permitted
>
>
> In [*3*]: m = Chem.MolFromSmiles('C1CCN1(C)C',sanitize=*False*)
>
>
> In [*6*]: m.UpdatePropertyCache(strict=*False*)
>
>
> In [*7*]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^
> Chem.SANITIZE_PROPERTIES)
>
> Out[*7*]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
>
> In [*8*]: Chem.MolToSmiles(m)
>
> Out[*8*]: 'CN1(C)CCC1'
>
>
> or if you want to be more aggressive you can also turn off the cleanup
> that "fixes" those odd structures:
>
> In [*9*]: m = Chem.MolFromSmiles('CCCN(=O)=O',sanitize=*False*)
>
>
> In [*10*]: m.UpdatePropertyCache(strict=*False*)
>
>
> In [*11*]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^
> Chem.SANITIZE_PROPERTIES^Chem.SANITIZE_CLEANUP)
>
> Out[*11*]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
>
> In [*12*]: Chem.MolToSmiles(m)
> Out[*12*]: 'CCCN(=O)=O'
>
> In either case, many standard molecular operations should still work,
> you'll just be operating on molecules with atoms in unreasonable valence
> states.
>
> -greg
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to