Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Greg Landrum
Hi Eduardo,

If I'm understanding what you want to do correctly, then you could try
extending your SMARTS pattern to include a ring bond to a neighbor from
each atom in the ring:
*@*~1~*(@*)~*(@*)~*(@*)~*(@*)~*~1@*

If you only want the indices of the ring atoms, you can then just pick
those out of the match results you get back.

-greg
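
A minimal sketch of picking the ring-atom indices out of the matches. It
assumes that query atoms are numbered in the order they appear in the SMARTS,
which puts the six ring atoms at positions 1, 2, 4, 6, 8 and 10 of each match
tuple. The helper name ring_atom_sets and the coronene test molecule are
illustrative and not from the thread.

```python
from rdkit import Chem

# Pattern from the message above: a six-membered ring in which every ring atom
# also has a ring bond to a neighbor outside the ring.
PATTERN = Chem.MolFromSmarts("*@*~1~*(@*)~*(@*)~*(@*)~*(@*)~*~1@*")

# Assumption: query atoms are indexed in SMARTS reading order, which puts the
# six ring atoms at these positions and the outside neighbors at the others.
RING_POSITIONS = (1, 2, 4, 6, 8, 10)


def ring_atom_sets(mol, pattern=PATTERN, positions=RING_POSITIONS):
    """Return the distinct sets of ring-atom indices found by the pattern."""
    rings = set()
    for match in mol.GetSubstructMatches(pattern):
        rings.add(frozenset(match[i] for i in positions))
    return rings


# Illustrative molecule (coronene), chosen only because its central ring fits.
coronene = Chem.MolFromSmiles("c1cc2ccc3ccc4ccc5ccc6ccc1c7c2c3c4c5c67")
central_rings = ring_atom_sets(coronene)
benzene_rings = ring_atom_sets(Chem.MolFromSmiles("c1ccccc1"))
```

Collecting frozensets means that symmetry-equivalent matches of the same ring
collapse into a single entry, so the result is one set of six indices per ring.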


On Tue, Jun 7, 2022 at 7:23 PM Eduardo Mayo wrote:

> Greetings!!
>
> I hope this email finds you well.
>
> I need a SMARTS pattern that matches this molecule fragment
> [image: image.png]
> The first pattern I used was:
> [*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1
>
> However, it also matches this fragment. This is not the expected behavior,
> but it agrees with the pattern, so I tried adding a ring-size constraint.
> [image: image.png]
> Now the pattern I am using is this:
> [*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1
>
> It worked quite well, but now it fails to find matches in this molecule:
> [image: image.png]
>
> Does anyone know what I am doing wrong??
>
> Code:
> ---
>
> from rdkit import Chem
>
> m1 = Chem.MolFromSmiles(
>     "c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
> m2 = Chem.MolFromSmiles(
>     "b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
> m3 = Chem.MolFromSmiles(
>     "b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")
>
> # First attempt: every ring atom is in exactly two SSSR rings
> p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
> for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
>     print(len(m.GetSubstructMatches(p)) == expected_value)
>
> # Second attempt: add the r6 ring-size constraint
> p = Chem.MolFromSmarts(
>     "[*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1")
> for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
>     print(len(m.GetSubstructMatches(p)) == expected_value)
>
> All the best,
> Eduardo
>
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Wim Dehaen
The above solution with !r4 doesn't work because, for SSSR reasons, these
atoms are also considered to be in a 4-membered ring when the 4-membered ring
is "exo" to the central 6-membered one. AFAIK there is no good way to do a
general ring size filter in an atom definition using SMARTS. Below is a
quite ugly, but working, solution:

from rdkit import Chem

def GetSubstructMatches_filtered(mol, pattern):
    # Keep only matches whose fragment SMILES has no ring-closure digit "2"
    matches = mol.GetSubstructMatches(pattern)
    filtered_matches = []
    for match in matches:
        if Chem.MolFragmentToSmiles(mol, atomsToUse=match).count("2") == 0:
            filtered_matches.append(match)
    return tuple(filtered_matches)

m1 = Chem.MolFromSmiles(
    "c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
m2 = Chem.MolFromSmiles(
    "b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
m3 = Chem.MolFromSmiles(
    "b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")

p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
    print(len(GetSubstructMatches_filtered(m, p)) == expected_value)



How does it work? The function GetSubstructMatches_filtered checks whether
there is more than one ring in the substructure (by converting the
substructure to SMILES using the atom indices from the GetSubstructMatches
result and searching for the ring-closure digit "2" in the string) and
rejects the match if so.
wim
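
A hedged sketch of the same "only one ring in the match" check done by
counting bonds instead of searching the SMILES string. This is a swapped-in
technique, not the function above: for a connected, cyclic set of atoms, the
number of bonds among them equals the number of atoms exactly when they form
a single ring. The names is_single_ring and single_ring_matches and the two
small test molecules are illustrative.

```python
from rdkit import Chem


def is_single_ring(mol, atom_indices):
    """True if the bonds among these atoms form exactly one ring.

    Assumes the atoms are connected and cyclic (as any match of a ring-closed
    SMARTS is), so bonds == atoms means exactly one ring.
    """
    atoms = set(atom_indices)
    n_bonds = sum(
        1
        for bond in mol.GetBonds()
        if bond.GetBeginAtomIdx() in atoms and bond.GetEndAtomIdx() in atoms
    )
    return n_bonds == len(atoms)


def single_ring_matches(mol, pattern):
    """Keep only the matches whose atoms span a single ring."""
    return tuple(
        m for m in mol.GetSubstructMatches(pattern) if is_single_ring(mol, m)
    )


# Illustrative check with a plain six-ring pattern (no R2 constraint):
six_ring = Chem.MolFromSmarts("*~1~*~*~*~*~*~1")
cyclohexane = Chem.MolFromSmiles("C1CCCCC1")
fused_four_rings = Chem.MolFromSmiles("C1CC2CCC12")  # perimeter is a 6-cycle

kept_cyclohexane = single_ring_matches(cyclohexane, six_ring)
raw_fused = fused_four_rings.GetSubstructMatches(six_ring)
kept_fused = single_ring_matches(fused_four_rings, six_ring)
```

The two fused four-membered rings show the failure mode: their six-atom
perimeter matches the ring pattern, but the extra bond across the middle makes
the bond count exceed the atom count, so the match is dropped.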




On Tue, Jun 7, 2022 at 8:52 PM Geoffrey Hutchison wrote:

> Nevermind, x3 won't exclude the fused 4-atom rings from your first
> example. I'll let you know if I think of some other way. :-)
>
>
> I think you'd want something like this, perhaps - to exclude atoms in ring
> size 4?
>
> [*;R2!r4]~1~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~1
>
> I also don't know if you're trying to ensure that each of the atoms are
> aromatic, in which case, you'd want something like:
>
> [a;R2!r4]~1~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~1
>
> Hope that helps,
> -Geoff


Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Geoffrey Hutchison
> Nevermind, x3 won't exclude the fused 4-atom rings from your first example. 
> I'll let you know if I think of some other way. :-)

I think you'd want something like this, perhaps, to exclude atoms in ring
size 4?

[*;R2!r4]~1~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~1

I also don't know if you're trying to ensure that each of the atoms is
aromatic, in which case you'd want something like:

[a;R2!r4]~1~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~1

Hope that helps,
-Geoff


Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Ivan Tubert-Brohman
On Tue, Jun 7, 2022 at 1:39 PM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> Perhaps using x3 instead (means "number of ring bonds") would work for
> your purposes?
>

Nevermind, x3 won't exclude the fused 4-atom rings from your first example.
I'll let you know if I think of some other way. :-)
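
A small sketch of why x3 cannot separate the two cases: x counts ring bonds
on an atom, so a fusion atom gets x3 whatever the sizes of the fused rings.
The helper x3_atoms and the two saturated test molecules are illustrative and
not taken from the thread.

```python
from rdkit import Chem

# [x3]: atoms with exactly three ring bonds (typically ring-fusion atoms).
X3 = Chem.MolFromSmarts("[x3]")


def x3_atoms(smiles):
    """Return the sorted indices of atoms matching [x3] in the given SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    return sorted(idx for (idx,) in mol.GetSubstructMatches(X3))


# Illustrative saturated examples:
decalin_fusion = x3_atoms("C1CCC2CCCCC2C1")  # two fused six-membered rings
bicyclo_fusion = x3_atoms("C1CC2CCC12")      # two fused four-membered rings
```

Both calls return exactly the two fusion atoms, so the x3 primitive alone
carries no information about ring size.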


Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Ivan Tubert-Brohman
Hi Eduardo,

I believe the problem is that r6 means "in *smallest* SSSR ring of size
6", where "smallest" in this context means that, for example, for an atom
at the ring fusion between a 5-member ring and a 6-member ring, r5 would
match that atom but r6 wouldn't.

Perhaps using x3 instead (means "number of ring bonds") would work for your
purposes?

Hope this helps,
Ivan
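
A minimal sketch of the r6 behaviour described above, using indole as an
illustrative molecule (it is not one of the molecules from the thread). The
helper matching_atoms is hypothetical; IsInRingSize is used to list atoms that
belong to any six-membered ring, for comparison with the r6 matches.

```python
from rdkit import Chem

# Illustrative molecule (indole): a five-membered ring fused to a six-membered ring.
indole = Chem.MolFromSmiles("c1ccc2[nH]ccc2c1")


def matching_atoms(mol, smarts):
    """Return the sorted atom indices matched by a single-atom SMARTS."""
    query = Chem.MolFromSmarts(smarts)
    return sorted(idx for (idx,) in mol.GetSubstructMatches(query))


r5_atoms = matching_atoms(indole, "[r5]")
r6_atoms = matching_atoms(indole, "[r6]")

# Atoms that are members of some six-membered ring, regardless of other rings.
in_six_ring = sorted(a.GetIdx() for a in indole.GetAtoms() if a.IsInRingSize(6))

# Fusion atoms: in a six-membered ring, yet not matched by r6.
missed_by_r6 = sorted(set(in_six_ring) - set(r6_atoms))
```

The two fusion atoms end up in missed_by_r6: they sit in the six-membered
ring, but their smallest ring has five atoms, so only r5 matches them.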




[Rdkit-discuss] Problem with Mol type in PostgreSQL and asyncpg

2022-06-07 Thread Katarzyna Rzęsikowska
Hi,
I want to insert a compound into the Substance table. The table has the
columns id, smiles (varchar) and structure (Mol). I used the Mol class from
the https://github.com/rvianello/razi repository.

When I try to execute this SQL query using asyncpg:

    insert into substance (smiles, structure)
    (select r.smiles, r.structure from unnest($1::substance[]) as r)

I get this error:

asyncpg.exceptions._base.UnsupportedClientFeatureError: cannot decode
type "public"."substance": text encoding of composite types is not
supported

Do you have any suggestions on how to fix it? I have to use asyncpg.
Thank you,
Kate


[Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Eduardo Mayo
Greetings!!

I hope this email finds you well.

I need a SMARTS pattern that matches this molecule fragment
[image: image.png]
The first pattern I used was:
[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1

However, it also matches this fragment. This is not the expected behavior,
but it agrees with the pattern, so I tried adding a ring-size constraint.
[image: image.png]
Now the pattern I am using is this:
[*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1

It worked quite well, but now it fails to find matches in this molecule:
[image: image.png]

Does anyone know what I am doing wrong??

Code:
---

from rdkit import Chem

m1 = Chem.MolFromSmiles(
    "c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
m2 = Chem.MolFromSmiles(
    "b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
m3 = Chem.MolFromSmiles(
    "b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")

# First attempt: every ring atom is in exactly two SSSR rings
p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
    print(len(m.GetSubstructMatches(p)) == expected_value)

# Second attempt: add the r6 ring-size constraint
p = Chem.MolFromSmarts(
    "[*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1")
for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
    print(len(m.GetSubstructMatches(p)) == expected_value)

All the best,
Eduardo