[Rdkit-discuss] rdDeprotect & DeprotectData

2023-08-21 Thread Katrina Lexa
Hi All,

I don't know why I'm struggling so much with this, as it seems like it
should be pretty straightforward. I'm trying to add some additional
deprotection SMIRKS to a data-cleaning Python script, but the new
reactions are not transforming my reactants into deprotected SMILES. I
have about 10 I'd like to add, so I know I could do it with simple
reactions, but I'd rather figure out where I'm going wrong here.

My definition of deprotect data:
#deborylation
deprotection_class = "boron"
reaction_smarts =  "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"
abbreviation = "BOO"
full_name = "deboron"
bdata = rdDeprotect.DeprotectData(deprotection_class, reaction_smarts,
abbreviation, full_name)
assert bdata.isValid()

I tried adding this line:
newDeprotect = rdDeprotect.DeprotectDataVect().append(bdata)

but it seems to make no difference:
try:
    #result = rdDeprotect.Deprotect(dep_m,deprotections=[bdata])
    result = rdDeprotect.Deprotect(dep_m,newDeprotect)


As an example, this is one of the SMILES strings in the file I'm reading
in that I would expect to be deprotected:
Cc1cc(B(O)O)ccc1OC(C)C

Maybe I'm just awful at writing SMIRKS?


Thanks for the help here,

Katrina
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] rdDeprotect & DeprotectData

2023-08-21 Thread James Davidson
Hi Katrina,

I'm slightly unsure what "deprotection" you are trying to represent, but I 
think there are a couple of problems with the rsmarts...

  reaction_smarts = "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"

This is looking for an aromatic carbon that has one hydrogen AND is connected 
to a non-ring boron. An aromatic ring carbon bonded to boron already has three 
neighbours, so it carries no hydrogen and this pattern will never be found!
Also, you have a mapped atom on the reactant side, but no mapped atoms on the 
product side.

If your reaction is aiming to hydrolyse non-cyclic boronic esters (and return 
the alcohols), then you should map the oxygen atom on the product side as well 
- something like:

  reaction_smarts = "c[B;R0](O)[O:1]>>[*:1]"

If, instead, you are interested in the virtual reaction that removes boronates 
from aryl R-groups (perhaps to calculate R-group fingerprints, etc) - then you 
should map the aryl carbon on both sides instead:

  reaction_smarts = "[c:1][B;R0](O)O>>[*:1]"

In either case you probably want to deduplicate products (the boronic acids and 
esters will match the pattern twice).

Kind regards

James
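
The second suggestion above can be tried outside rdDeprotect with a plain reaction and RunReactants(). The following is a minimal sketch, assuming the example SMILES from the original post; collecting canonical SMILES in a set is just one possible way to deduplicate the products, not necessarily the approach the author had in mind.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Virtual deborylation: the aryl carbon is mapped on both sides,
# so it is kept while the unmapped B(O)O atoms are dropped.
rxn = AllChem.ReactionFromSmarts("[c:1][B;R0](O)O>>[*:1]")

# Example molecule taken from the original post.
mol = Chem.MolFromSmiles("Cc1cc(B(O)O)ccc1OC(C)C")

# The two oxygens are interchangeable, so the same product can be
# generated more than once; a set of canonical SMILES removes repeats.
unique_products = set()
for product_set in rxn.RunReactants((mol,)):
    for product in product_set:
        Chem.SanitizeMol(product)  # products come back unsanitized
        unique_products.add(Chem.MolToSmiles(product))

print(unique_products)
```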


Re: [Rdkit-discuss] rdDeprotect & DeprotectData

2023-08-21 Thread James Davidson
Hi Katrina,

I must confess I haven't actually used rdDeprotect before (I have always 
created reactions and called the RunReactants() method)...
I just tried your use case, and I think it is working as you would like (I 
can't immediately see what is wrong in the original code you posted).
Here is a gist showing it (I hope): 
https://gist.github.com/jepdavidson/ec1664a8bfa8b921262fc844c0e523e4

Kind regards

James
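
One possible explanation for the original call having no effect, offered as an assumption rather than something confirmed in this thread: in Python, append() on a DeprotectDataVect modifies the vector in place and returns None, so newDeprotect would be None and Deprotect() would then fall back to its default deprotections. A minimal sketch, using the aryl-carbon-mapped SMARTS from the earlier reply and passing the data as a plain list (the keyword form that was commented out in the original post):

```python
from rdkit import Chem
from rdkit.Chem import rdDeprotect

# Custom deprotection using the aryl-carbon-mapped SMARTS from the thread.
bdata = rdDeprotect.DeprotectData(
    "boron", "[c:1][B;R0](O)O>>[*:1]", "BOO", "deboron"
)
assert bdata.isValid()

# append() mutates the vector and returns None, so assigning its
# return value (as in the original post) loses the vector.
vect = rdDeprotect.DeprotectDataVect()
returned = vect.append(bdata)
print(returned, len(vect))

# Passing the custom data explicitly as a list.
mol = Chem.MolFromSmiles("Cc1cc(B(O)O)ccc1OC(C)C")
result = rdDeprotect.Deprotect(mol, deprotections=[bdata])
smiles = Chem.MolToSmiles(result)
print(smiles)
```

If the built-in deprotections should still be applied as well, one option is to build a list from rdDeprotect.GetDeprotections() and add the custom entries to it before calling Deprotect().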


Re: [Rdkit-discuss] rdDeprotect & DeprotectData

2023-08-21 Thread Katrina Lexa
Hi James,

Thanks for the quick reply!

You're quite right, I'm simply interested in the virtual reaction to remove
the boronates. Thank you for fixing my incorrect mapping. At some point, I
had had the aryl carbon properly specified, but I clearly lost my way with
it along my quest.

Sadly, the reaction_smarts = "[c:1][B;R0](O)O>>[*:1]" still does not remove
any of the boronates from my input smiles, but it sounds like everything
else about the specification of the reaction is correct, so I'll get there
at some point with the right reaction_smarts.

Thanks again,

Katrina
