Many thanks for your responses. Recursive smiles become cumbersome to write when they are on fused aromatics. I was hopping I could write a simple function that would cover all regio-isomers from any aromatics... Looks more complicated than I thought! Thanks, Alexis
On Wed, 29 Jan 2020 at 13:49, Greg Landrum <greg.land...@gmail.com> wrote: > I agree with Chris here: I don't think you can do this with dot > disconnection and that you need to use recursive SMARTS as Chris describes. > > A tweak to Chris' answer: if you want the query to look for regioisomers > around a phenyl ring you unfortunately need to actually specify the whole > ring in order to avoid having the query match a Cl and a Br in different > fused rings in some cases. This gets a bit unwieldy: > Cl[c;$(c1(Cl)c(Br)cccc1),$(c1(Cl)cc(Br)ccc1),$(c1(Cl)ccc(Br)cc1)] > > Here's a demo: > > In [12]: p = > Chem.MolFromSmarts('Cl[c;$(c1(Cl)c(Br)cccc1),$(c1(Cl)cc(Br)ccc1),$(c1(Cl)ccc(Br)cc1)]') > > > In [13]: Chem.MolFromSmiles('Clc1c(Br)cccc1').HasSubstructMatch(p) > > Out[13]: True > > In [14]: Chem.MolFromSmiles('Clc1cc(Br)ccc1').HasSubstructMatch(p) > > Out[14]: True > > In [15]: Chem.MolFromSmiles('Clc1ccc(Br)cc1').HasSubstructMatch(p) > > Out[15]: True > > In [16]: > Chem.MolFromSmiles('ClC1=CC=CC2=C1C(Br)=CC=C2').HasSubstructMatch(p) > > Out[16]: False > > > It's harder if you want to actually retrieve the atoms that match the > query directly using the SMARTS. Hopefully you're just looking for the > match. :-) > > -greg > > > On Wed, Jan 29, 2020 at 1:39 PM Chris Earnshaw <cgearns...@gmail.com> > wrote: > >> Hi >> >> Dot-disconnected fragments are not going to work for this, as you >> describe. You need to use recursive SMARTS (see >> https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html section >> 4.4). Something like: >> Clc[$(cBr);$(ccBr);$(cccBr)] >> should (I hope!) be a reasonable starting point. >> >> Chris Earnshaw >> >> On Wed, 29 Jan 2020 at 11:12, Alexis Parenty < >> alexis.parenty.h...@gmail.com> wrote: >> >>> Hi everyone, >>> >>> >>> >>> Is there a way to get a substructure match of regioisomers using a >>> smarts by separating the fragments with “.”: >>> >>> >>> >>> The following approach works but is too permissive since it will also >>> match structures with a bromide or a chloride linked to a aliphatic >>> carbon... >>> >>> >>> [image: image.png] >>> >>> >>> >>> mol_smiles_structure = Chem.MolFromSmiles("Clc1cc(Br)ccc1") >>> >>> mol_smarts_fragment = Chem.MolFromSmarts("c1ccccc1.[Cl].[Br]") >>> >>> print(mol_smiles_structure.HasSubstructMatch(mol_smarts_fragment)) >>> >>> ð True >>> >>> >>> >>> >>> >>> mol_smiles_structure = Chem.MolFromSmiles("ClCc1cc(Br)ccc1") >>> >>> print(mol_smiles_structure.HasSubstructMatch(mol_smarts_fragment)) >>> >>> ð True >>> >>> >>> >>> If I specify that the Br and the Cl need to be attached to an aromatic >>> to make it less permissive, it no longer match because I cannot specify >>> that the two general aromatics [a] also belong to the benzene ring... (i,e. >>> it tries to look for 8 aromatics instead of six...) >>> >>> mol_smiles_structure = Chem.MolFromSmiles("Clc1cc(Br)ccc1") >>> >>> mol_smarts_fragment = Chem.MolFromSmarts("c1ccccc1.[a][Cl].[a][Br]") >>> >>> print(mol_smiles_structure.HasSubstructMatch(mol_smarts_fragment)) >>> >>> ð False >>> >>> >>> >>> Thanks! >>> >>> Alexis >>> >>> >>> >>> >>> _______________________________________________ >>> Rdkit-discuss mailing list >>> Rdkit-discuss@lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >>> >> _______________________________________________ >> Rdkit-discuss mailing list >> Rdkit-discuss@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> >
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss