Re: [Rdkit-discuss] SMARTS pattern
Hi Eduardo,

If I'm understanding what you want to do correctly, then you could try extending your SMARTS pattern to include a ring bond to a neighbor from each atom in the ring:

    *@*~1~*(@*)~*(@*)~*(@*)~*(@*)~*~1@*

If you only want the indices of the ring atoms, you can then just pick those out of the match results you get back.

-greg

On Tue, Jun 7, 2022 at 7:23 PM Eduardo Mayo wrote:
> Greetings!!
>
> I hope this email finds you well.
>
> I need a SMARTS pattern that matches this molecule fragment
> [image: image.png]
> The first pattern I used was:
> [*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1
>
> However, it also matches this fragment. This is not the expected behavior
> but it agrees with the pattern, so I tried adding the ring size constraint.
> [image: image.png]
> Now the pattern I am using is this:
> [*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1
>
> It worked quite well but now it fails to find matches in this molecule
> [image: image.png]
>
> Does anyone know what I am doing wrong??
>
> Code:
> ---
>
> m1 = Chem.MolFromSmiles(
>     "c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
> m2 = Chem.MolFromSmiles(
>     "b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
> m3 = Chem.MolFromSmiles(
>     "b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")
>
> p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
> for m, expected_value in zip([m1,m2,m3],[1,2,2]):
>     print(len(m.GetSubstructMatches(p)) == expected_value)
>
> p = Chem.MolFromSmarts(
>     "[*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1")
> for m, expected_value in zip([m1,m2,m3],[1,2,2]):
>     print(len(m.GetSubstructMatches(p)) == expected_value)
>
> All the best,
> Eduardo

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
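Picking the ring atoms out of the match results, as suggested in the message above, might look like the following minimal sketch. It assumes that RDKit numbers query atoms in the order they are written in the SMARTS, so the six ring atoms sit at positions 1, 2, 4, 6, 8 and 10 of each match and the other six positions are the ring-bonded neighbors. The helper name `ring_atoms_of_matches` and the test molecules (coronene and naphthalene) are illustrative choices, not taken from the thread.

```python
from rdkit import Chem

# Pattern from the message above: a six-membered ring in which every ring
# atom also has a ring bond (@) to a neighbor outside the ring.
PATTERN = Chem.MolFromSmarts("*@*~1~*(@*)~*(@*)~*(@*)~*(@*)~*~1@*")

# Positions of the six ring atoms within the query, counted in the order
# the atoms are written in the SMARTS (the other six are the neighbors).
RING_POSITIONS = (1, 2, 4, 6, 8, 10)


def ring_atoms_of_matches(mol):
    """Return a set of sorted ring-atom index tuples, one per distinct ring."""
    rings = set()
    for match in mol.GetSubstructMatches(PATTERN):
        # match[i] is the molecule atom index mapped to query atom i
        rings.add(tuple(sorted(match[i] for i in RING_POSITIONS)))
    return rings


# Illustrative molecules (assumptions for this sketch, not from the thread)
coronene = Chem.MolFromSmiles("c1cc2ccc3ccc4ccc5ccc6ccc1c7c2c3c4c5c67")
naphthalene = Chem.MolFromSmiles("c1ccc2ccccc2c1")

print(ring_atoms_of_matches(coronene))     # only the central ring qualifies
print(ring_atoms_of_matches(naphthalene))  # no ring has six fused neighbors
```

Collecting the ring atoms as sorted tuples in a set means that two matches differing only in the order of traversal count as the same ring.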
Re: [Rdkit-discuss] SMARTS pattern
The above solution with !r4 doesn't work because, for SSSR reasons, these atoms are considered to be in a 4-membered ring even if the 4-membered ring is "exo" to the central 6-membered one. AFAIK there is no good way to do a general ring size filter in an atom definition using SMARTS.

Below is a quite ugly, but working, solution:

    from rdkit import Chem

    def GetSubstructMatches_filtered(mol, pattern):
        matches = mol.GetSubstructMatches(pattern)
        filtered_matches = []
        for match in matches:
            # A "2" in the fragment SMILES indicates a second ring closure,
            # i.e. more than one ring among the matched atoms: reject those.
            if Chem.MolFragmentToSmiles(mol, atomsToUse=match).count("2") == 0:
                filtered_matches.append(match)
        return tuple(filtered_matches)

    m1 = Chem.MolFromSmiles("c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
    m2 = Chem.MolFromSmiles("b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
    m3 = Chem.MolFromSmiles("b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")

    p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
    for m, expected_value in zip([m1,m2,m3],[1,2,2]):
        print(len(GetSubstructMatches_filtered(m,p)) == expected_value)

How does it work? The function GetSubstructMatches_filtered checks whether there is more than one ring in the substructure (by converting the substructure to SMILES, using the atom indices from the GetSubstructMatches result, and searching for "2" in the string) and rejects the match if so.

wim

On Tue, Jun 7, 2022 at 8:52 PM Geoffrey Hutchison wrote:
> > Nevermind, x3 won't exclude the fused 4-atom rings from your first
> > example. I'll let you know if I think of some other way. :-)
>
> I think you'd want something like this, perhaps - to exclude atoms in ring
> size 4?
>
> [*;R2!r4]~1~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~1
>
> I also don't know if you're trying to ensure that each of the atoms is
> aromatic, in which case, you'd want something like:
>
> [a;R2!r4]~1~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~1
>
> Hope that helps,
> -Geoff
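A related way to reject matches that span more than one ring, without inspecting the SMILES string, is to count the bonds among the matched atoms: a simple six-membered cycle has exactly six, and any extra bond is a chord across the ring. The sketch below is an alternative to the function in the message above, not a restatement of it. The name `matches_without_chords`, the unconstrained six-ring pattern, and the two test molecules are illustrative assumptions.

```python
from rdkit import Chem


def matches_without_chords(mol, pattern):
    """Keep only matches whose atoms are joined by exactly as many bonds
    as there are atoms, i.e. a single simple ring with no chord."""
    kept = []
    for match in mol.GetSubstructMatches(pattern):
        atoms = set(match)
        # Count bonds that have both ends inside the matched atom set
        n_internal = sum(
            1
            for bond in mol.GetBonds()
            if bond.GetBeginAtomIdx() in atoms and bond.GetEndAtomIdx() in atoms
        )
        if n_internal == len(atoms):
            kept.append(match)
    return tuple(kept)


# Any six atoms connected in a cycle (no R2 constraint, to keep the demo small)
six_cycle = Chem.MolFromSmarts("*~1~*~*~*~*~*~1")

# Illustrative molecules (assumptions for this sketch, not from the thread)
cyclohexane = Chem.MolFromSmiles("C1CCCCC1")
# Two fused four-membered rings: the perimeter is a six-atom cycle with a chord
bicyclo = Chem.MolFromSmiles("C1CC2CCC12")

print(len(matches_without_chords(cyclohexane, six_cycle)))
print(len(bicyclo.GetSubstructMatches(six_cycle)))
print(len(matches_without_chords(bicyclo, six_cycle)))
```

The bond count depends only on the molecular graph, so it is unaffected by digits that can appear in a SMILES string for other reasons, such as isotope labels or charges.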
Re: [Rdkit-discuss] SMARTS pattern
> Nevermind, x3 won't exclude the fused 4-atom rings from your first example.
> I'll let you know if I think of some other way. :-)

I think you'd want something like this, perhaps - to exclude atoms in ring size 4?

    [*;R2!r4]~1~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~1

I also don't know if you're trying to ensure that each of the atoms is aromatic, in which case, you'd want something like:

    [a;R2!r4]~1~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~1

Hope that helps,
-Geoff
Re: [Rdkit-discuss] SMARTS pattern
On Tue, Jun 7, 2022 at 1:39 PM Ivan Tubert-Brohman <ivan.tubert-broh...@schrodinger.com> wrote:
> Perhaps using x3 instead (means "number of ring bonds") would work for
> your purposes?

Nevermind, x3 won't exclude the fused 4-atom rings from your first example. I'll let you know if I think of some other way. :-)
Re: [Rdkit-discuss] SMARTS pattern
Hi Eduardo,

I believe the problem is that r6 means "in *smallest* SSSR ring of size n" (here n = 6), where "smallest" in this context means that, for example, for an atom at the ring fusion between a 5-member ring and a 6-member ring, r5 would match that atom but r6 wouldn't.

Perhaps using x3 instead (means "number of ring bonds") would work for your purposes?

Hope this helps,
Ivan

On Tue, Jun 7, 2022 at 1:22 PM Eduardo Mayo wrote:
> Greetings!!
>
> I hope this email finds you well.
>
> I need a SMARTS pattern that matches this molecule fragment
> [image: image.png]
> The first pattern I used was:
> [*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1
>
> However, it also matches this fragment. This is not the expected behavior
> but it agrees with the pattern, so I tried adding the ring size constraint.
> [image: image.png]
> Now the pattern I am using is this:
> [*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1
>
> It worked quite well but now it fails to find matches in this molecule
> [image: image.png]
>
> Does anyone know what I am doing wrong??
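The "smallest ring" behavior of r described in Ivan's message can be checked on a small fused system. The sketch below uses indane (a benzene ring fused to a five-membered ring), which is an illustrative molecule chosen for this example and not one from the thread; `atoms_matching` is a hypothetical helper name.

```python
from rdkit import Chem

# Indane: atoms 2 and 7 are shared by the 5-membered and 6-membered rings
indane = Chem.MolFromSmiles("C1Cc2ccccc2C1")


def atoms_matching(mol, smarts):
    """Return the sorted atom indices matched by a single-atom SMARTS."""
    patt = Chem.MolFromSmarts(smarts)
    return sorted(idx for (idx,) in mol.GetSubstructMatches(patt))


# r<n> tests the size of the *smallest* ring the atom is in, so the two
# fusion atoms count as r5 and are missed by r6.
print(atoms_matching(indane, "[r5]"))  # [0, 1, 2, 7, 8]
print(atoms_matching(indane, "[r6]"))  # [3, 4, 5, 6]

# The Python API can ask about membership in a ring of a given size directly
fusion_atom = indane.GetAtomWithIdx(2)
print(fusion_atom.IsInRingSize(5), fusion_atom.IsInRingSize(6))  # True True
```

This is why a pattern that requires r6 on every ring atom stops matching as soon as one of the six atoms is also part of a smaller fused ring.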
[Rdkit-discuss] Problem with Mol type in PostgreSQL and asyncpg
Hi,

I want to insert a compound into the table Substance. The table has the columns id, smiles (varchar) and structure (Mol). I used the Mol class from the https://github.com/rvianello/razi repository.

When I try to execute this SQL query using asyncpg:

    insert into substance (smiles, structure)
        (select r.smiles, r.structure from unnest($1::substance[]) as r)

I get this error:

    asyncpg.exceptions._base.UnsupportedClientFeatureError: cannot decode type "public"."substance": text encoding of composite types is not supported

Do you have any suggestions on how to fix it? I have to use asyncpg.

Thank you,
Kate
[Rdkit-discuss] SMARTS pattern
Greetings!!

I hope this email finds you well.

I need a SMARTS pattern that matches this molecule fragment
[image: image.png]
The first pattern I used was:

    [*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1

However, it also matches this fragment. This is not the expected behavior but it agrees with the pattern, so I tried adding the ring size constraint.
[image: image.png]
Now the pattern I am using is this:

    [*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1

It worked quite well but now it fails to find matches in this molecule
[image: image.png]

Does anyone know what I am doing wrong??

Code:
---

    from rdkit import Chem

    m1 = Chem.MolFromSmiles(
        "c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
    m2 = Chem.MolFromSmiles(
        "b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
    m3 = Chem.MolFromSmiles(
        "b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")

    # First attempt: every ring atom is in exactly two SSSR rings
    p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
    for m, expected_value in zip([m1,m2,m3],[1,2,2]):
        print(len(m.GetSubstructMatches(p)) == expected_value)

    # Second attempt: additionally require r6 on every ring atom
    p = Chem.MolFromSmarts(
        "[*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1")
    for m, expected_value in zip([m1,m2,m3],[1,2,2]):
        print(len(m.GetSubstructMatches(p)) == expected_value)

All the best,
Eduardo