Re: [Rdkit-discuss] Hypervalent halogen structures - chlorate etc.

2017-11-22 Thread Greg Landrum
On Wed, Nov 22, 2017 at 10:09 AM, Chris Earnshaw wrote:

> Thanks Greg, I suspected that might be the case. I certainly don't
> want to turn off all sanitization, but I could write a bit of code to
> convert these salts back to my desired form as a separate step.
>

It's worth thinking about adding customizable versions of the cleanup
function and accepted valence states to allow people to use alternate
chemistry representations. These are famous last words, but that doesn't
seem like it would be *that* hard.



> Alternatively, I guess I could disable or rewrite halogenCleanup() in
> MolOps.cpp. Is this the only function which would need to be changed?
>

I think so. You'd also need to add the additional acceptable valence states
in atomic_data.cpp.

-greg



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Hypervalent halogen structures - chlorate etc.

2017-11-22 Thread Chris Earnshaw
Thanks Greg, I suspected that might be the case. I certainly don't
want to turn off all sanitization, but I could write a bit of code to
convert these salts back to my desired form as a separate step.

Alternatively, I guess I could disable or rewrite halogenCleanup() in
MolOps.cpp. Is this the only function which would need to be changed?

Best regards,
Chris




Re: [Rdkit-discuss] Hypervalent halogen structures - chlorate etc.

2017-11-22 Thread Greg Landrum
At the moment the only way to do this is to disable the "cleanup"
functionality in SanitizeMol(). This can be done, but it also means that
things like the hypervalent N in nitro groups (i.e. "-N(=O)=O") are not
cleaned up.

In [1]: from rdkit import Chem

In [2]: m = Chem.MolFromSmiles('O=Cl(=O)(=O)[O-]',sanitize=False)

In [3]: m.UpdatePropertyCache(strict=False)

In [4]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_CLEANUP^Chem.SANITIZE_PROPERTIES)
Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [5]: Chem.MolToSmiles(m)
Out[5]: 'O=Cl(=O)(=O)[O-]'

In [6]: m2 = Chem.MolFromSmiles('CN(=O)=O',sanitize=False)

In [7]: m2.UpdatePropertyCache(strict=False)

In [8]: Chem.SanitizeMol(m2,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_CLEANUP^Chem.SANITIZE_PROPERTIES)
Out[8]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [10]: Chem.MolToSmiles(m2)
Out[10]: 'CN(=O)=O'


Note that this approach also switches off SANITIZE_PROPERTIES, so it disables
all checking for "unreasonable" valences.
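
The sanitizeOps expression in the session above is plain bit arithmetic on
flag values. Here is a minimal sketch of that idiom which does not need RDKit
installed. The integer constants are stand-ins assumed for illustration, and
drop_flags() is a hypothetical helper, not an RDKit function; real code should
pass the Chem.SANITIZE_* values themselves.

```python
# Stand-in bit values, assumed for illustration only.
# In real code use Chem.SANITIZE_ALL, Chem.SANITIZE_CLEANUP, etc.
SANITIZE_NONE = 0x0
SANITIZE_CLEANUP = 0x1
SANITIZE_PROPERTIES = 0x2
SANITIZE_KEKULIZE = 0x8
SANITIZE_ALL = 0xFFFFFFF


def drop_flags(base, *flags):
    """Clear each flag from base, whether or not it was set (hypothetical helper)."""
    mask = base
    for flag in flags:
        mask &= ~flag
    return mask


# XOR against a mask that has every bit set simply clears the named bits,
# which is what the expression in the session does.
xor_mask = SANITIZE_ALL ^ SANITIZE_CLEANUP ^ SANITIZE_PROPERTIES
and_mask = drop_flags(SANITIZE_ALL, SANITIZE_CLEANUP, SANITIZE_PROPERTIES)

# XOR toggles rather than clears: starting from a mask that lacks
# SANITIZE_CLEANUP, it would switch the cleanup step on.
toggled = SANITIZE_KEKULIZE ^ SANITIZE_CLEANUP
cleared = drop_flags(SANITIZE_KEKULIZE, SANITIZE_CLEANUP)

print(hex(xor_mask), hex(and_mask), toggled, cleared)
```

Both spellings give the same mask when starting from SANITIZE_ALL. The
"& ~flag" form is the safer habit when the starting mask is anything else. As
far as I understand it, the SANITIZE_NONE return value shown in Out[4] and
Out[8] means that no sanitization step reported a failure.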

-greg


On Tue, Nov 21, 2017 at 10:12 AM, Chris Earnshaw wrote:

> Hi
>
> Sometime between 2014 and now there appears to have been a change in
> the way hypervalent halogen structures are handled. The old behaviour
> (involving some tweaking of atomic_data.cpp to allow the higher
> oxidation states) was to have a neutral halogen with double bonds to
> most of the oxygens, e.g.
> chlorate O=Cl(=O)[O-]
> perchlorate O=Cl(=O)(=O)[O-]
>
> The current behaviour is to 'charge separate' the dative bonds, giving
> chlorate [O-][Cl2+]([O+])[O-]
> perchlorate [O-][Cl3+]([O-])([O+])[O-]
>
> Although this may be regarded as 'correct' (arguable!), it can cause
> problems of compatibility with other software and looks remarkably
> ugly. It's also inconsistent with the handling of hypervalent P and S
> compounds. Using the same convention, we should have -
>
> trimethylphosphine oxide C[P+]([O-])(C)C
> dimethylsulfoxide C[S+]([O-])C
> dimethylsulfone C[S2+]([O-])([O-])C
>
> - and I really don't want this to happen!
>
> Does anyone know a way to restore the old behaviour for chlorites,
> bromates, periodates etc.?
>
> Best regards,
> Chris Earnshaw
>


Re: [Rdkit-discuss] Hypervalent halogen structures - chlorate etc.

2017-11-21 Thread Chris Earnshaw
Hi

Oops - typo! The chlorate and perchlorate oxygens should all be [O-].
I was copying by hand from a different computer...
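
The charge bookkeeping behind this correction can be checked mechanically. The
following is a rough sketch, not an RDKit routine: net_charge() and
bracket_charge() are hypothetical helpers that only read the charge suffix of
bracket atoms, accept both the "Cl2+" spelling used in this thread and the
standard SMILES "Cl+2" form, and assume there are no atom-map numbers inside
the brackets.

```python
import re

# Charge suffix at the end of a bracket atom body: optional digits, one or
# more signs, optional digits. The lookbehind keeps an H count such as the
# "3" in "NH3+" from being read as a charge magnitude.
_CHARGE = re.compile(r"(?<!H)(\d*)([+-]+)(\d*)$")


def bracket_charge(body):
    """Formal charge of one bracket atom body, e.g. 'O-', 'Cl2+' or 'Cl+3'."""
    match = _CHARGE.search(body)
    if match is None:
        return 0
    before, signs, after = match.groups()
    # Use an explicit number if present, otherwise count the sign characters.
    magnitude = int(before or after or len(signs))
    return magnitude if signs[0] == "+" else -magnitude


def net_charge(smiles):
    """Sum the formal charges of all bracket atoms in a SMILES string."""
    bodies = re.findall(r"\[([^\]]*)\]", smiles)
    return sum(bracket_charge(body) for body in bodies)


examples = {
    "chlorate as first posted": "[O-][Cl2+]([O+])[O-]",
    "perchlorate as first posted": "[O-][Cl3+]([O-])([O+])[O-]",
    "chlorate, all oxygens [O-]": "[O-][Cl2+]([O-])[O-]",
    "perchlorate, all oxygens [O-]": "[O-][Cl3+]([O-])([O-])[O-]",
    "perchlorate, double-bond form": "O=Cl(=O)(=O)[O-]",
}

for label, smiles in examples.items():
    print(f"{label}: {smiles} -> net charge {net_charge(smiles)}")
```

The strings as first posted sum to +1, while the versions with every oxygen
written as [O-] sum to -1, the same as the double-bond forms.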

Regards,
Chris

On 21 November 2017 at 09:19, Paolo Tosco wrote:
> Hi Chris,
>
> if the behaviour with chlorate and perchlorate is the one you report
>
> chlorate [O-][Cl2+]([O+])[O-]
> perchlorate [O-][Cl3+]([O-])([O+])[O-]
>
> it looks wrong to me as there is an overall formal charge of +1. All O's
> should bear a -1 charge.
>
> Cheers,
> p.
