Re: [Rdkit-discuss] SMARTS Pattern and scaffold

2018-02-05 Thread Paolo Tosco

Dear Colin,

You can specify the number of implicit Hs that you want on each carbon
of the indazole nucleus, e.g.:


'[#7]1:[#6&h1]:[#6]2:[#6&h1]:[#6&h1]:[#6&h1]:[#6&h1]:[#6]:2:[#7]:1-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1'

Because every CH carbon of the indazole must then carry exactly one
implicit H (h1), this rules out substituted indazoles.
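
A minimal sketch of how the two patterns compare in practice. The four SMILES below are illustrative molecules written for this example (an assumption, not the exact strings from the thread), and the names open_patt and blocked_patt are hypothetical:

```python
from rdkit import Chem

# Illustrative molecules (assumed for this sketch, not the thread's own strings)
smis = (
    'CCc1ccccc1Cn1ncc2ccccc12',    # phenyl substituted, indazole untouched
    'Cc1ccc(Cn2ncc3ccccc23)cc1',   # phenyl substituted, indazole untouched
    'CCc1cccc2c1cnn2Cc1ccccc1',    # indazole substituted at C4
    'FCc1ccc2c(c1)cnn2Cc1ccccc1',  # indazole substituted at C5
)
ms = [Chem.MolFromSmiles(s) for s in smis]

# Original pattern: no hydrogen constraints on the indazole carbons
open_patt = Chem.MolFromSmarts(
    '[#7]1:[#6]:[#6]2:[#6]:[#6]:[#6]:[#6]:[#6]:2:[#7]:1'
    '-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1')

# Suggested pattern: h1 requires one implicit H on each indazole CH carbon
blocked_patt = Chem.MolFromSmarts(
    '[#7]1:[#6&h1]:[#6]2:[#6&h1]:[#6&h1]:[#6&h1]:[#6&h1]:[#6]:2:[#7]:1'
    '-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1')

open_hits = [m.HasSubstructMatch(open_patt) for m in ms]
blocked_hits = [m.HasSubstructMatch(blocked_patt) for m in ms]

for smi, o, b in zip(smis, open_hits, blocked_hits):
    print(smi, o, b)
```

Note that lowercase h is an RDKit extension to SMARTS that counts implicit hydrogens, so this sketch assumes molecules parsed from SMILES without hydrogens added as explicit atoms.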

HTH, cheers
p.


On 02/05/18 09:26, Colin Bournez wrote:

Hello everyone,

I am having trouble getting the matches I want with a SMARTS pattern.
Say I have, for example, these molecules:

from rdkit import Chem
from rdkit.Chem import Draw

smis = ('n2cc1ccccc1n2Cc1ccccc1CC','n2cc1ccccc1n2Cc1c(CC)cc(CCl)cc1','n2cc1c(CC)cccc1n2Cc1ccccc1CC','n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC')

ms = [Chem.MolFromSmiles(x) for x in smis]
Draw.MolsToGridImage(ms)

I have this SMARTS pattern:

patt = Chem.MolFromSmarts('[#7]1:[#6]:[#6]2:[#6]:[#6]:[#6]:[#6]:[#6]:2:[#7]:1-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1')

When I run:

for smi,m in zip(smis,ms):
    print(smi,m.HasSubstructMatch(patt))

I logically get:

n2cc1ccccc1n2Cc1ccccc1CC True
n2cc1ccccc1n2Cc1c(CC)cc(CCl)cc1 True
n2cc1c(CC)cccc1n2Cc1ccccc1CC True
n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC True

My goal is to get:

n2cc1ccccc1n2Cc1ccccc1CC True
n2cc1ccccc1n2Cc1c(CC)cc(CCl)cc1 True
n2cc1c(CC)cccc1n2Cc1ccccc1CC False
n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC False

More precisely, I want to "block" the indazole against any substitution
and retrieve only molecules that vary on the phenyl.

Thanks in advance.

Colin Bournez
--
Colin Bournez
PhD Student, Structural Bioinformatics & Chemoinformatics
Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311
Rue de Chartres, 45067 Orléans, France
T. +33 238 494 577



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




[Rdkit-discuss] SMARTS Pattern and scaffold

2018-02-05 Thread Colin Bournez

Hello everyone,

I am having trouble getting the matches I want with a SMARTS pattern.
Say I have, for example, these molecules:

from rdkit import Chem
from rdkit.Chem import Draw

smis = ('n2cc1ccccc1n2Cc1ccccc1CC','n2cc1ccccc1n2Cc1c(CC)cc(CCl)cc1','n2cc1c(CC)cccc1n2Cc1ccccc1CC','n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC')

ms = [Chem.MolFromSmiles(x) for x in smis]
Draw.MolsToGridImage(ms)

I have this SMARTS pattern:

patt = Chem.MolFromSmarts('[#7]1:[#6]:[#6]2:[#6]:[#6]:[#6]:[#6]:[#6]:2:[#7]:1-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1')

When I run:

for smi,m in zip(smis,ms):
    print(smi,m.HasSubstructMatch(patt))

I logically get:

n2cc1ccccc1n2Cc1ccccc1CC True
n2cc1ccccc1n2Cc1c(CC)cc(CCl)cc1 True
n2cc1c(CC)cccc1n2Cc1ccccc1CC True
n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC True

My goal is to get:

n2cc1ccccc1n2Cc1ccccc1CC True
n2cc1ccccc1n2Cc1c(CC)cc(CCl)cc1 True
n2cc1c(CC)cccc1n2Cc1ccccc1CC False
n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC False

More precisely, I want to "block" the indazole against any substitution
and retrieve only molecules that vary on the phenyl.

Thanks in advance.

Colin Bournez

--
Colin Bournez
PhD Student, Structural Bioinformatics & Chemoinformatics
Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université d'Orléans 7311
Rue de Chartres, 45067 Orléans, France
T. +33 238 494 577