Re: [Rdkit-discuss] (no subject)

2016-09-21 Thread Greg Landrum
Hi Markus,

Curt's instincts are dead on: the problem here is the rings.

I'll show the fix and then explain what's going on. You just need to add
one line to your code:

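from rdkit import Chem
from rdkit.Chem import AllChem

core = "[a]12[a][a][a][a][a]1[a][a][a]2"
pattern = Chem.MolFromSmarts(core)
Chem.GetSSSR(pattern)  # the added line: perceive the smallest rings first
AllChem.Compute2DCoords(pattern)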

When I do this, I get the following depiction for "c1cccc(ocn2)c21":

(The highlighting is due to the substructure match that's done during the
generation of coordinates).

So why is this necessary?
The code that generates 2D coordinates uses information about the size of
ring systems in the molecule as part of the coordinate generation. If no
ring information is present (which is true of molecules generated from
SMARTS since they are not fully sanitized on construction) then the code
calls FastFindRings(). This function is perfectly capable of identifying
all ring atoms and bonds, but it isn't very good at getting ring sizes
correct for fused systems (it finds rings, but not the smallest rings). The
consequences are the badly generated coordinates for fused ring systems
that you were seeing.

I think the current behavior of the code isn't really ideal: the
coordinate generation code should call the SSSR algorithm in these cases so
that it can generate better coordinates. I'll take a look at the code and
think about changing it.
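
To make the explanation above concrete, here is a minimal sketch (not taken
from the attached notebook) that runs SSSR perception on the query from this
thread and then inspects the ring sizes the depiction code will see. The
variable names ring_sizes and fusion_atoms are just illustrative, and the
return value of Chem.GetSSSR() is deliberately ignored since only its side
effect on the ring information matters here.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# The 5,6-fused aromatic query discussed in this thread.
pattern = Chem.MolFromSmarts("[a]12[a][a][a][a][a]1[a][a][a]2")

# Molecules built from SMARTS carry no ring information, so perceive
# the smallest rings explicitly before asking for coordinates.
Chem.GetSSSR(pattern)

# Sizes of the perceived rings, smallest first.
ring_sizes = sorted(len(ring) for ring in pattern.GetRingInfo().AtomRings())
print(ring_sizes)

# With ring information in place, ring-count queries such as [R2]
# can also be evaluated against the query molecule itself.
fusion_atoms = pattern.GetSubstructMatches(Chem.MolFromSmarts("[R2]"))
print(fusion_atoms)

# Coordinate generation now starts from correct ring sizes.
AllChem.Compute2DCoords(pattern)
```

The two atoms reported by the [R2] query are the ring-fusion atoms, which is
a quick sanity check that the five- and six-membered rings were perceived as
separate rings.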

As an aside: if you're puzzled by the behavior of
AllChem.GenerateDepictionMatching2DStructure() you can always just take a
look at the drawing of the query molecule itself. It's not always the most
informative depiction when it comes to what the atom and bond queries are,
but you at least will see the coordinates.
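
As a rough sketch of that check, assuming benzoxazole written as
"c1ccc2ocnc2c1" stands in for the molecule in the notebook, the following
generates coordinates for the query, aligns the molecule to it and then
compares the coordinates of the matched atoms. The helper max_deviation is
hypothetical and only exists for this illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Query with ring perception done before coordinate generation.
pattern = Chem.MolFromSmarts("[a]12[a][a][a][a][a]1[a][a][a]2")
Chem.GetSSSR(pattern)
AllChem.Compute2DCoords(pattern)

# Benzoxazole (assumed example molecule), depicted so that the matched
# atoms sit on the query's coordinates.
mol = Chem.MolFromSmiles("c1ccc2ocnc2c1")
AllChem.GenerateDepictionMatching2DStructure(mol, pattern)

ref_conf = pattern.GetConformer()
mol_conf = mol.GetConformer()


def max_deviation(match):
    """Largest x/y difference between query atoms and their matched atoms."""
    worst = 0.0
    for query_idx, mol_idx in enumerate(match):
        p = ref_conf.GetAtomPosition(query_idx)
        q = mol_conf.GetAtomPosition(mol_idx)
        worst = max(worst, abs(p.x - q.x), abs(p.y - q.y))
    return worst


# The query is symmetric, so check every mapping and keep the best one.
matches = mol.GetSubstructMatches(pattern, uniquify=False)
best = min(max_deviation(match) for match in matches)
print(len(matches), best)
```

If the best deviation is essentially zero, the molecule really was laid out
on the query's coordinates, so any odd-looking depiction comes from the
query coordinates themselves.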

A second aside: the molecule depictions in that notebook indicate that you
are stuck using the fallback drawing code, which creates fairly ugly
pictures. You can get better drawings by either installing cairo and
pycairo (in which case the code should automatically use those) or telling
the drawing code to use SVG for the rendering:

from rdkit.Chem.Draw import IPythonConsole
IPythonConsole.ipython_useSVG=True

It really does make the drawings a lot better.

I hope this helps,
-greg






On Wed, Sep 21, 2016 at 8:47 PM, Markus Metz  wrote:

> Hello all:
>
> I am trying to perform a 2D alignment of molecules by using a pattern for
> which I am using Compute2DCoords.
>
> If I use a SMARTS string matching naphthalene, the 2D depiction is as one
> would expect.
> However, if I switch to a 5,6 aromatic SMARTS pattern, the 2D structure of
> the matched benzoxazole looks rather unusual.
>
> Is there a way to match the 5,6 with the 6,6 pattern behavior?
>
> Any hint is very much appreciated,
>
> Markus
>
> P.S. a work book is attached.
>
--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] (no subject)

2016-09-21 Thread Curt Fischer
Hi Markus,

I suspect the problem is that your SMARTS query is not as specific as you
might think.  For example, RDKit does not understand how many rings there
are in your SMARTS query.  Each [a] could be an atom arbitrarily connected
to many other rings that wouldn't be a part of the substructure match.
Thus, RDKit cannot generate a meaningful set of 2D coordinates for your
SMARTS patterns, and defaults somehow to the unhelpful representation you
reported.

Compare:

from rdkit import Chem

# define two molecules, one from SMILES, one from SMARTS
naphthalene = Chem.MolFromSmiles('c1cccc2c1cccc2')
naphthalene_smarts = Chem.MolFromSmarts('c1cccc2c1cccc2')

# define a query that hits atoms that are in two rings
in_two_rings = Chem.MolFromSmarts('[R2]')

# find atoms in our molecules that are in two rings
naphthalene.GetSubstructMatches(in_two_rings)
# the next call fails because the RingInfo object of this molecule
# could not be initialized
naphthalene_smarts.GetSubstructMatches(in_two_rings)

A path forward for you could be setting the RingInfo of your SMARTS query
manually, but I'm not exactly sure how to do that.  Maybe others could
weigh in?  Here's a SMARTS that might be useful: it should hit any molecule
that consists of an aromatic benzene fused to any (aromatic or aliphatic)
five-membered ring:
benzene_with_five_membered_fusion = Chem.MolFromSmarts(
    '[*r5R1]1[cR2]2[cR1][cR1][cR1][cR1][cR2]2[*r5R1][*r5R1]1')
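
A quick way to try that SMARTS out is sketched below. The three test
molecules and their SMILES are my own picks for illustration, not something
from the original notebook.

```python
from rdkit import Chem

# Benzene fused to any (aromatic or aliphatic) five-membered ring.
benzene_with_five_membered_fusion = Chem.MolFromSmarts(
    '[*r5R1]1[cR2]2[cR1][cR1][cR1][cR1][cR2]2[*r5R1][*r5R1]1')

# Illustrative test molecules (assumed examples).
examples = {
    'benzoxazole': 'c1ccc2ocnc2c1',   # aromatic five-membered ring
    'indane': 'C1Cc2ccccc2C1',        # aliphatic five-membered ring
    'naphthalene': 'c1ccc2ccccc2c1',  # 6,6 system, no five-membered ring
}

hits = {}
for name, smiles in examples.items():
    mol = Chem.MolFromSmiles(smiles)
    hits[name] = mol.HasSubstructMatch(benzene_with_five_membered_fusion)

print(hits)
```

The ring-size and ring-count primitives here are evaluated against the
target molecules, which are fully sanitized, so the query itself does not
need ring information for this kind of matching.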

Curt



[Rdkit-discuss] 2D alignment question

2016-09-21 Thread Markus Metz
==
Trying to post my message again as there seems to be a problem with my
first attempt.
==

Hello all:

I am trying to perform a 2D alignment of molecules by using a pattern for
which I am using Compute2DCoords.

If I use a SMARTS string matching naphthalene, the 2D depiction is as one
would expect.
However, if I switch to a 5,6 aromatic SMARTS pattern, the 2D structure of
the matched benzoxazole looks rather unusual.

Is there a way to match the 5,6 with the 6,6 pattern behavior?

Any hint is very much appreciated,

Markus

P.S. a work book is attached.


2dDepiction.ipynb
Description: Binary data


[Rdkit-discuss] (no subject)

2016-09-21 Thread Markus Metz
Hello all:

I am trying to perform a 2D alignment of molecules by using a pattern for
which I am using Compute2DCoords.

If I use a SMARTS string matching naphthalene, the 2D depiction is as one
would expect.
However, if I switch to a 5,6 aromatic SMARTS pattern, the 2D structure of
the matched benzoxazole looks rather unusual.

Is there a way to match the 5,6 with the 6,6 pattern behavior?

Any hint is very much appreciated,

Markus

P.S. a work book is attached.


2dDepiction.ipynb
Description: Binary data


Re: [Rdkit-discuss] rearomatize only benzene rigns after kekulize + clearAromaticFlags

2016-09-21 Thread Guillaume GODIN
Thanks Greg,


After testing the code, it works perfectly, thanks!

Unfortunately, I discovered that it's still not compatible with the aromaticity
method used in the article I mentioned in another post from Rudolf Naef.

I need to keep the aromaticity of all 6-membered rings (containing C or N, which
is possible using your function), but also keep the aromaticity information of
fused 6-membered rings (e.g. naphthalene) and convert/keep guanidinium moieties
as aromatic too.

So it would be more interesting to find a fast process that revokes aromaticity
only on rings that are not 6-membered, which should preserve all 6-membered
rings and fused aromatic rings, and also set guanidinium salts as aromatic.

Would it be possible to do that?


Best regards,


Guillaume






From: Greg Landrum
Sent: Tuesday, 20 September 2016 16:41
To: Guillaume GODIN
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] rearomatize only benzene rigns after kekulize +
clearAromaticFlags

Guillaume,

Here's how to read in a molecule and skip aromaticity perception:
In [12]: m = Chem.MolFromSmiles('c1ccccc1-c1cccnc1-c1cc[nH]c1',sanitize=False)

In [13]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_SETAROMATICITY)
Out[13]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [14]: Chem.MolToSmiles(m)
Out[14]: 'C1=CC=C(C2=CC=CN=C2C2=CNC=C2)C=C1'

Then you need to find all of the benzene-like rings and set the appropriate
flags:
In [16]: def set_aroms(m,pattern='[#6]1=[#6][#6]=[#6][#6]=[#6]1'):
    ...:     p = Chem.MolFromSmarts(pattern)
    ...:     matches = m.GetSubstructMatches(p)
    ...:     for match in matches:
    ...:         # flag every matched atom as aromatic
    ...:         for mi in match:
    ...:             m.GetAtomWithIdx(mi).SetIsAromatic(True)
    ...:         # then convert the matched ring bonds to aromatic bonds
    ...:         for bond in p.GetBonds():
    ...:             mb = m.GetBondBetweenAtoms(match[bond.GetBeginAtomIdx()],match[bond.GetEndAtomIdx()])
    ...:             assert mb
    ...:             mb.SetBondType(Chem.BondType.AROMATIC)
    ...:             mb.SetIsAromatic(True)
    ...:

In [17]: set_aroms(m)

In [18]: Chem.MolToSmiles(m)
Out[18]: 'C1=CN=C(C2=CNC=C2)C(c2ccccc2)=C1'


Note that the RDKit does have a simplified aromaticity model available out of 
the box that only recognizes aromaticity in 5- and 6- rings, but this isn't 
quite as simple as what you're looking for.
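
For comparison, here is a minimal sketch of applying that simplified model
to the same molecule. It assumes the model referred to above is the one
exposed as Chem.AromaticityModel.AROMATICITY_SIMPLE; the variable n_aromatic
is only there for illustration.

```python
from rdkit import Chem

# Read the molecule and sanitize it without aromaticity perception,
# which leaves it in Kekule form.
m = Chem.MolFromSmiles('c1ccccc1-c1cccnc1-c1cc[nH]c1', sanitize=False)
Chem.SanitizeMol(m, sanitizeOps=Chem.SANITIZE_ALL ^ Chem.SANITIZE_SETAROMATICITY)

# Apply the simplified aromaticity model (assumed to be AROMATICITY_SIMPLE),
# which only considers 5- and 6-membered rings.
Chem.SetAromaticity(m, Chem.AromaticityModel.AROMATICITY_SIMPLE)

# Count the atoms that ended up flagged as aromatic.
n_aromatic = sum(1 for atom in m.GetAtoms() if atom.GetIsAromatic())
print(n_aromatic, Chem.MolToSmiles(m))
```

Unlike set_aroms(), this also marks the pyridine and pyrrole rings as
aromatic, which is why it is not a drop-in replacement when only
benzene-like rings should keep their aromaticity.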

Best,
-greg


On Tue, Sep 20, 2016 at 6:52 AM, Guillaume GODIN
<guillaume.go...@firmenich.com> wrote:

Dear All,


I would like to kekulize the molecule, but conserve the aromaticity
information only for C6 (benzene-like) rings.

So what do I need to do after this command?

Chem.rdmolops.Kekulize(mol, clearAromaticFlags=True)

Should I store the locations of the benzene-like rings before the Kekulize step
and restore them afterwards? And how do I restore them (call SetIsAromatic(True)
on the atoms and change the bond type)?


best regards,


Guillaume GODIN
