Re: [Rdkit-discuss] rearomatize only benzene rings after kekulize + clearAromaticFlags

2016-09-22 Thread Guillaume GODIN
OK guys, I came up with a possible solution for re-aromatizing the
six-membered aromatic rings made of C and N after kekulization.

I still need to find a way to do the same for guanidinium salts.

from numpy import zeros
from rdkit import Chem

# Occ(pattern, mol) is my own counting helper, defined elsewhere (not shown here)
def TestL_n(L, n, aro):
    suppl = Chem.SDMolSupplier('/Users/mbp/Downloads/molecules-20-18279-s001/Compounds List for Heat-of-Combustion Calculations.sdf', removeHs=False)
    i = 0
    cp = 0  # number of molecules with at least one occurrence
    v = zeros(len(L))
    for mol in suppl:
        if aro:
            s = Chem.MolToSmiles(mol)
            m = rearomatization(s)
        else:
            m = mol
        i = i + 1
        for j in range(0, len(L)):
            if j == n - 1:
                try:
                    c = Occ(L[j], m)
                    v[j] += c
                    if c > 0:
                        cp += 1
                        print(Chem.MolToSmiles(Chem.RemoveHs(m)))
                except Exception:
                    print("error")
    return v[n - 1], cp


# r6 (C or N) rearomatization: C/N & guanidinium
def keep6aro(m):  #[N;v3X3,v4X4+][CX3](=[N;v3X2,v4X3+])[N;v3X3,v4X4+]
    # greg version
    # ring of 6 of C or N only!
    r6 = Chem.MolFromSmarts('[#6,#7;a]1[#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a]1')
    atomkeep = m.GetSubstructMatches(r6)
    ri = m.GetRingInfo()
    BondRing = ri.BondRings()
    bondkeep = []
    for bondring in BondRing:
        if len(bondring) == 6 and isRingAromatic(m, bondring):
            bondkeep.append(bondring)
    return atomkeep, bondkeep

def Aromatics6ring2(m, atomkeep, bondkeep):  #[N;v3X3,v4X4+][CX3](=[N;v3X2,v4X3+])[N;v3X3,v4X4+]
    # greg version
    for match in atomkeep:
        for mi in match:
            m.GetAtomWithIdx(mi).SetIsAromatic(True)
    for bondring in bondkeep:
        for bondid in bondring:
            mb = m.GetBondWithIdx(bondid)
            mb.SetBondType(Chem.rdchem.BondType.AROMATIC)
            mb.SetIsAromatic(True)
    return m

def isRingAromatic(mol, BondRing):
    for bondid in BondRing:
        if not mol.GetBondWithIdx(bondid).GetIsAromatic():
            return False
    return True

def rearomatization(s):
    mol = Chem.MolFromSmiles(s)
    # record the aromatic 6-rings before the flags are cleared
    atomkeep, bondkeep = keep6aro(mol)
    Chem.rdmolops.Kekulize(mol, clearAromaticFlags=True)
    mol = Aromatics6ring2(mol, atomkeep, bondkeep)
    return mol


Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R&D DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE  +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8


From: Greg Landrum
Sent: Thursday, 22 September 2016 10:22
To: Guillaume GODIN
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] rearomatize only benzene rings after kekulize +
clearAromaticFlags



On Wed, Sep 21, 2016 at 4:31 PM, Guillaume GODIN wrote:

> After testing the code, it works perfectly, thanks!

Well, there's at least that. ;-)

> Unfortunately, I discovered that it's still not compatible with the
> aromaticity method used in the article by Rudolf Naef that I mentioned
> in another post.

Before going further with this I have a question for you (note that I still
haven't had time to read the paper in detail): I understand that in order to
exactly reproduce the results from that paper you do need to reproduce the
aromaticity model used. However, if you were to borrow the methods and data
from the paper, you could theoretically build your own models based on RDKit
aromaticity. This would likely be more efficient at runtime than re-perceiving
aromaticity.

> I need to keep the aromaticity of all 6-membered rings (containing C or N,
> which is possible using your function), but also keep the aromaticity of
> fused 6-membered rings (e.g. naphthalene, ...) and convert/keep
> guanidinium moieties aromatic too.
>
> So I would be more interested in finding a fast process to revoke
> aromaticity only on rings that are not 6-membered, which should preserve
> all 6-membered rings and fused aromatic rings, and also set guanidinium
> salts as aromatic.

"Revoking" aromaticity is tricky because you really need to also kekulize the
rings that you remove aromaticity from.
I think you're going to be better off just describing the features that are
aromatic and applying the method I described in the previous message.

The SMARTS I sent to you should certainly work for fused rings like naphthalene
and could be adapted to support heteroatoms. Guanidinium is a different problem
though... the RDKit does not tolerate aromatic bonds/atoms that aren't in
rings. What exactly do you want to do there?

-greg

Re: [Rdkit-discuss] (no subject)

2016-09-22 Thread Markus Metz
Hello:
Greg and Curt, your comments are very much appreciated.
Thanks for getting back to me!
Best,
Markus

On Wed, Sep 21, 2016 at 8:31 PM, Greg Landrum 
wrote:

> Hi Markus,
>
> Curt's instincts are dead on: the problem here is the rings.
>
> I'll show the fix and then explain what's going on. You just need to add
> one line to your code:
>
> core = "[a]12[a][a][a][a][a]1[a][a][a]2"
> pattern = Chem.MolFromSmarts(core)
> Chem.GetSSSR(pattern)
> AllChem.Compute2DCoords(pattern)
>
> When I do this, I get the following depiction for "c1(ocn2)c21":
>
> (The highlighting is due to the substructure match that's done during the
> generation of coordinates).
>
> So why is this necessary?
> The code that generates 2D coordinates uses information about the size of
> ring systems in the molecule as part of the coordinate generation. If no
> ring information is present (which is true of molecules generated from
> SMARTS since they are not fully sanitized on construction) then the code
> calls FastFindRings(). This function is perfectly capable of identifying
> all ring atoms and bonds, but it isn't very good at getting ring sizes
> correct for fused systems (it finds rings, but not the smallest rings). The
> consequences are the badly generated coordinates for fused ring systems
> that you were seeing.
>
> I think the current behavior of the code "isn't really ideal": the
> coordinate generation code should call the SSSR algorithm in these cases so
> that it can generate better coordinates. I'll take a look at the code and
> think about changing it.
>
> As an aside: if you're puzzled by the behavior of
> AllChem.GenerateDepictionMatching2DStructure() you can always just take a
> look at the drawing of the query molecule itself. It's not always the most
> informative depiction when it comes to what the atom and bond queries are,
> but you at least will see the coordinates.
>
> A second aside: the molecule depictions in that notebook indicate that you
> are stuck using the fallback drawing code, which creates fairly ugly
> pictures. You can get better drawings by either installing cairo and
> pycairo (in which case the code should automatically use those) or telling
> the drawing code to use SVG for the rendering:
>
> from rdkit.Chem.Draw import IPythonConsole
> IPythonConsole.ipython_useSVG=True
>
> It really does make the drawings a lot better.
>
> I hope this helps,
> -greg
>
>
>
>
>
>
> On Wed, Sep 21, 2016 at 8:47 PM, Markus Metz  wrote:
>
>> Hello all:
>>
>> I am trying to perform a 2D alignment of molecules using a pattern for
>> which I generate coordinates with Compute2DCoords.
>>
>> If I use a SMARTS string matching naphthalene, the 2D depiction is as one
>> would expect.
>> However, if I switch to a 5,6 aromatic SMARTS pattern, the 2D structure of
>> the matched benzoxazole looks rather unusual.
>>
>> Is there a way to make the 5,6 pattern behave like the 6,6 pattern?
>>
>> Any hint is very much appreciated,
>>
>> Markus
>>
>> P.S. A workbook is attached.
>>
>> 
>> --
>>
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>
--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
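
For anyone reading along, the fix Greg describes in the message quoted above can be sketched end to end roughly as follows. This is only a minimal sketch: the SMARTS and the GetSSSR/Compute2DCoords calls come from the thread, while the benzoxazole SMILES and the ring-size check are illustrative additions that are not taken from Markus's notebook.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# 5,6-fused aromatic query from the thread
core = "[a]12[a][a][a][a][a]1[a][a][a]2"
pattern = Chem.MolFromSmarts(core)

# Run the SSSR search explicitly so that ring info holds the smallest rings
# before coordinates are generated.
Chem.GetSSSR(pattern)
ring_sizes = sorted(len(ring) for ring in pattern.GetRingInfo().AtomRings())

# With ring info in place, generate the template coordinates.
AllChem.Compute2DCoords(pattern)

# Illustrative target (an assumption, not from the notebook): benzoxazole.
mol = Chem.MolFromSmiles("c1ccc2ocnc2c1")

# Lay out mol so that the atoms matching the pattern take the pattern's coordinates.
AllChem.GenerateDepictionMatching2DStructure(mol, pattern)
```

Checking `ring_sizes` before drawing is a cheap way to confirm that the query really is seen as one five-membered and one six-membered ring.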


Re: [Rdkit-discuss] error while calculating rms and shape tanimoto

2016-09-22 Thread Amit singh
Hi
>> Files 1.pdb and 2.pdb do not contain CONECT records (so missing bond
>> orders?).

I removed the CONECT records from files a.pdb and b.pdb, and the code still
runs fine with them.

>> File 1.pdb contains an atom with name BR43. Maybe the PDB parser can't
>> parse that (seems valid to me, FWTW).

I also tried removing BR43 from the file. I still get the same error.


Amit


On Thu, Sep 22, 2016 at 4:26 PM, Paul Emsley 
wrote:

>
>
> File 1.pdb contains an atom with name BR43. Maybe the PDB parser can't
> parse that (seems valid to me, FWTW).
>
> Paul
>
>
>
> On 22/09/16 11:15, Amit singh wrote:
>
> Hi
>
> Files a.pdb and b.pdb are from the RDKit test data (working fine).
>
> Files 1.pdb and 2.pdb are not from the test data and give the error.
>
> On Thu, Sep 22, 2016 at 3:03 PM, Greg Landrum
> wrote:
>
>> Hi Amit,
>>
>>
>> On Thu, Sep 22, 2016 at 9:23 AM, Amit singh
>> wrote:
>>
>>>
>>> I am new to this discussion forum and to RDKit.
>>>
>>
>> Welcome!
>>
>>
>>> I am trying to calculate the shape Tanimoto and RMS between two molecules
>>> (PDB files) using the 3D functionality of RDKit.
>>> The code works fine for the PDB files in the test data,
>>> but gives an error whenever I use other PDB files.
>>>
>>>
>>> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
>>> Traceback (most recent call last):
>>>   File "", line 1, in
>>> RuntimeError: std::exception
>>> ---
>>> It looks like there is a problem with the input files, but I need help.
>>>
>>
>> In order to be able to answer the question, we need a bit more
>> information. Can you please share what files you loaded mol1 and mol2 from
>> so that we can reproduce the problem?
>>
>> -greg
>>
>>
>
>
>
>
>


Re: [Rdkit-discuss] error while calculating rms and shape tanimoto

2016-09-22 Thread Paul Emsley
Files 1.pdb and 2.pdb do not contain CONECT records (so missing bond 
orders?).


File 1.pdb contains an atom with name BR43. Maybe the PDB parser can't 
parse that (seems valid to me, FWTW).


Paul


On 22/09/16 11:15, Amit singh wrote:

> Hi
>
> Files a.pdb and b.pdb are from the RDKit test data (working fine).
>
> Files 1.pdb and 2.pdb are not from the test data and give the error.
>
> On Thu, Sep 22, 2016 at 3:03 PM, Greg Landrum wrote:
>
>> Hi Amit,
>>
>> On Thu, Sep 22, 2016 at 9:23 AM, Amit singh wrote:
>>
>>> I am new to this discussion forum and to RDKit.
>>
>> Welcome!
>>
>>> I am trying to calculate the shape Tanimoto and RMS between two
>>> molecules (PDB files) using the 3D functionality of RDKit.
>>> The code works fine for the PDB files in the test data,
>>> but gives an error whenever I use other PDB files.
>>>
>>> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
>>> Traceback (most recent call last):
>>>   File "", line 1, in
>>> RuntimeError: std::exception
>>> ---
>>> It looks like there is a problem with the input files, but I need help.
>>
>> In order to be able to answer the question, we need a bit more
>> information. Can you please share what files you loaded mol1 and
>> mol2 from so that we can reproduce the problem?
>>
>> -greg
>
> --
>
> Dr. Amit Kumar
> Scientist B
> National Institute of Cancer Prevention and Research
> (Formerly Institute of Cytology and Preventive Oncology)
> I-7, Sector - 39, Noida - 201301 Uttar Pradesh




Re: [Rdkit-discuss] error while calculating rms and shape tanimoto

2016-09-22 Thread Amit singh
Hi

Files a.pdb and b.pdb are from the RDKit test data (working fine).

Files 1.pdb and 2.pdb are not from the test data and give the error.

On Thu, Sep 22, 2016 at 3:03 PM, Greg Landrum
wrote:

> Hi Amit,
>
>
> On Thu, Sep 22, 2016 at 9:23 AM, Amit singh wrote:
>
>>
>> I am new to this discussion forum and to RDKit.
>>
>
> Welcome!
>
>
>> I am trying to calculate the shape Tanimoto and RMS between two molecules
>> (PDB files) using the 3D functionality of RDKit.
>> The code works fine for the PDB files in the test data,
>> but gives an error whenever I use other PDB files.
>>
>>
>> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
>> Traceback (most recent call last):
>>   File "", line 1, in
>> RuntimeError: std::exception
>> ---
>> It looks like there is a problem with the input files, but I need help.
>>
>
> In order to be able to answer the question, we need a bit more
> information. Can you please share what files you loaded mol1 and mol2 from
> so that we can reproduce the problem?
>
> -greg
>
>


-- 

Dr. Amit Kumar
Scientist B
National Institute of Cancer Prevention and Research
(Formerly Institute of Cytology and Preventive Oncology)
 I-7, Sector - 39, Noida - 201301 Uttar Pradesh


b.pdb
Description: application/aportisdoc


a.pdb
Description: application/aportisdoc


2.pdb
Description: application/aportisdoc


1.pdb
Description: application/aportisdoc


Re: [Rdkit-discuss] error while calculating rms and shape tanimoto

2016-09-22 Thread Greg Landrum
Hi Amit,


On Thu, Sep 22, 2016 at 9:23 AM, Amit singh wrote:

>
> I am new to this discussion forum and to RDKit.
>

Welcome!


> I am trying to calculate the shape Tanimoto and RMS between two molecules
> (PDB files) using the 3D functionality of RDKit.
> The code works fine for the PDB files in the test data,
> but gives an error whenever I use other PDB files.
>
>
> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
> Traceback (most recent call last):
>   File "", line 1, in
> RuntimeError: std::exception
> ---
> It looks like there is a problem with the input files, but I need help.
>

In order to be able to answer the question, we need a bit more information.
Can you please share what files you loaded mol1 and mol2 from so that we
can reproduce the problem?

-greg


Re: [Rdkit-discuss] rearomatize only benzene rings after kekulize + clearAromaticFlags

2016-09-22 Thread Greg Landrum
On Wed, Sep 21, 2016 at 4:31 PM, Guillaume GODIN <
guillaume.go...@firmenich.com> wrote:

> After testing the code, it works perfectly, thanks!
>
Well, there's at least that. ;-)


> Unfortunately, I discovered that it's still not compatible with the
> aromaticity method used in the article by Rudolf Naef that I mentioned
> in another post.
>
Before going further with this I have a question for you (note that I still
haven't had time to read the paper in detail): I understand that in order
to exactly reproduce the results from that paper you do need to reproduce
the aromaticity model used. However, if you were to borrow the methods and
data from the paper, you could theoretically build your own models based on
RDKit aromaticity. This would likely be more efficient at runtime than
re-perceiving aromaticity.


> I need to keep the aromaticity of all 6-membered rings (containing C or N,
> which is possible using your function), but also keep the aromaticity of
> fused 6-membered rings (e.g. naphthalene, ...) and convert/keep
> guanidinium moieties aromatic too.
>
> So I would be more interested in finding a fast process to revoke
> aromaticity only on rings that are not 6-membered, which should preserve
> all 6-membered rings and fused aromatic rings, and also set guanidinium
> salts as aromatic.
>
 "Revoking" aromaticity is tricky because you really need to also kekulize
the rings that you remove aromaticity from.
I think you're going to be better off just describing the features that are
aromatic and applying the method I described in the previous message.
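
As a rough illustration of "describe the aromatic features, then re-apply them", here is a minimal sketch. It assumes the C/N six-ring SMARTS that appears elsewhere in this thread. The function names `keep_six_ring_aromaticity` and `n_aromatic_atoms` are made up for this example and are not RDKit API. The matches are recorded before kekulization, because the SMARTS relies on the aromatic flags that `clearAromaticFlags=True` removes.

```python
from rdkit import Chem

# Six-membered aromatic ring made only of C or N (pattern from this thread).
SIX_RING = Chem.MolFromSmarts(
    '[#6,#7;a]1[#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a]1')

def keep_six_ring_aromaticity(smiles):
    """Kekulize, then restore aromatic flags on C/N six-membered rings only."""
    mol = Chem.MolFromSmiles(smiles)
    # Record the rings while the aromatic flags still exist.
    matches = mol.GetSubstructMatches(SIX_RING)
    Chem.Kekulize(mol, clearAromaticFlags=True)
    for match in matches:
        for k, idx in enumerate(match):
            mol.GetAtomWithIdx(idx).SetIsAromatic(True)
            # Atoms in a match follow the ring order of the query, so
            # consecutive entries (wrapping around) are bonded.
            nxt = match[(k + 1) % len(match)]
            bond = mol.GetBondBetweenAtoms(idx, nxt)
            bond.SetBondType(Chem.BondType.AROMATIC)
            bond.SetIsAromatic(True)
    return mol

def n_aromatic_atoms(mol):
    """Count atoms still flagged aromatic."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetIsAromatic())

naphthalene = keep_six_ring_aromaticity('c1ccc2ccccc2c1')
indole = keep_six_ring_aromaticity('c1ccc2[nH]ccc2c1')
furan = keep_six_ring_aromaticity('c1ccoc1')
```

With this approach fused six-membered rings are handled one match at a time, and five-membered rings simply stay in their Kekulé form.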

The SMARTS I sent to you should certainly work for fused rings like
naphthalene and could be adapted to support heteroatoms. Guanidinium is a
different problem though... the RDKit does not tolerate aromatic
bonds/atoms that aren't in rings. What exactly do you want to do there?
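
The restriction on non-ring aromatic atoms is easy to see from a small experiment. This is only a sketch: `parses` is a hypothetical helper written for this example, and the guanidinium SMILES is an illustrative Kekulé form that does not come from the thread.

```python
from rdkit import Chem

def parses(smiles):
    """Return True if RDKit builds and sanitizes a molecule from the SMILES."""
    return Chem.MolFromSmiles(smiles) is not None

ring_ok = parses('c1ccccc1')   # aromatic atoms inside a ring
chain_ok = parses('cc')        # aromatic atoms outside any ring

# Guanidinium written with explicit single and double bonds (illustrative).
guanidinium = Chem.MolFromSmiles('NC(N)=[NH2+]')
n_aromatic = sum(1 for atom in guanidinium.GetAtoms() if atom.GetIsAromatic())
```

Any scheme that treats guanidinium as "aromatic" therefore probably has to keep that label outside the molecule's own aromaticity flags, for example as a property or a separate SMARTS match.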

-greg


[Rdkit-discuss] error while calculating rms and shape tanimoto

2016-09-22 Thread Amit singh
Dear All

I am new to this discussion forum and to RDKit.

I am trying to calculate the shape Tanimoto and RMS between two molecules (PDB
files) using the 3D functionality of RDKit.
The code works fine for the PDB files in the test data,
but gives an error whenever I use other PDB files.


>>> rms = rdMolAlign.AlignMol(mol1, mol2)
Traceback (most recent call last):
  File "", line 1, in
RuntimeError: std::exception
---
It looks like there is a problem with the input files, but I need help.

Thanks

Amit
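
For readers who want to try the calculation without the PDB files, here is a minimal sketch that builds small molecules from SMILES instead. The helper names `embed`, `rms_and_shape` and `try_align` are made up for this example. When no atom map is given, `AlignMol` needs a substructure match between the two molecules, and a missing match is one situation in which it raises a `RuntimeError`. Whether that is what happened with 1.pdb and 2.pdb is not established in this thread.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdShapeHelpers

def embed(smiles, seed=42):
    """Build a molecule with explicit Hs and one 3D conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol

def rms_and_shape(probe, ref):
    """Align probe onto ref, then return (RMS, shape Tanimoto distance)."""
    rms = rdMolAlign.AlignMol(probe, ref)
    dist = rdShapeHelpers.ShapeTanimotoDist(probe, ref)
    return rms, dist

def try_align(probe, ref):
    """Return (rms, dist), or None if RDKit raises a RuntimeError."""
    try:
        return rms_and_shape(probe, ref)
    except RuntimeError as err:
        print("alignment failed:", err)
        return None

# Same molecule, two conformers: an atom mapping exists.
same = try_align(embed('CCO', seed=1), embed('CCO', seed=2))
# Ethanol against benzene: no substructure match, so no mapping.
different = try_align(embed('CCO'), embed('c1ccccc1'))
```

Wrapping the call this way at least shows the exception message, which is usually more informative than the bare traceback.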