Re: [Rdkit-discuss] kekulizing carbazole

2010-11-01 Thread Paul Emsley
On 31/10/10 14:18, Greg Landrum wrote:
 Hi Paul,

 On Sun, Oct 31, 2010 at 12:09 PM, Paul Emsley
 paul.ems...@bioch.ox.ac.uk  wrote:


 I'm running into problems when I try to kekulize carbazole.

 The description I start with is that all the bonds are marked as
 Bond::AROMATIC and I do setIsAromatic(true) on all the atoms (which are
 all non-hydrogens).  The explicitValence() for the N is 3.

 MolOps::Kekulize() fails in that case with the message "Can't kekulize mol".

 If I add single bonds to hydrogens (including a hydrogen on the N) then
 MolOps::Kekulize() works.

 So my question is, how should I adjust the molecule description in the
 first case so that MolOps::Kekulize() works without hydrogens too?
  
 The problem is probably the lack of an explicit H on the nitrogen
 atom. It's easily demonstrated with pyrrole:

 [2]  m=Chem.MolFromSmiles('c1cccn1')
 [15:10:05] Can't kekulize mol

 You can fix this by letting the RDKit know that there's an H on the N atom:
 [3]  m=Chem.MolFromSmiles('c1ccc[nH]1')
 [4]

 Carbazole is the same story:
 [4]  m=Chem.MolFromSmiles('c1ccc2c(c1)[nH]c1ccccc21')
 [5]

 Note that in either case if you provide the structure in its Kekule
 form this doesn't happen, here's the illustration for pyrrole:
 [5]  m=Chem.MolFromSmiles('C1=CC=CN1')
 [6]  Chem.MolToSmiles(m)
 Out[6] 'c1cc[nH]c1'

 There's an argument to be made that the Kekulization code could be
 made more robust with respect to this particular edge case, but to
 this point the effort involved has not seemed justified by the payoff:
 most of the time the H is present in the SMILES, so this problem
 doesn't occur.



Hi Greg,

Thanks for your informative and speedy reply.

For the record, I would like to describe how I proceeded in the light of 
your reply.

My starting point to construct an RWMol is an mmCIF restraints file.  As 
well as containing description of the bonds and angles (etc.) this file 
describes the atoms, part of the description of which is the 
type_energy.  Pyrrole and carbazole Ns (for example) have the type
NR15 (energy types are listed in ener_lib.cif [1]), so now when I see
an atom of that type [2], I add an extra H bonded to the N and 
everything is then peachy.
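
A minimal Python sketch of the approach described above, for readers who want to try it: build the ring with every atom and bond flagged aromatic, then attach a real H atom to any atom whose energy type says it carries one. Only the NR15 label comes from this thread; the atom table, the "C?" placeholders and the NEEDS_H set are hypothetical stand-ins for whatever the restraints file provides.

```python
from rdkit import Chem

# (element, type_energy) pairs for the five pyrrole ring atoms.
# Assumption: only 'NR15' is taken from the thread; 'C?' is a placeholder.
ring_atoms = [('C', 'C?'), ('C', 'C?'), ('C', 'C?'), ('C', 'C?'), ('N', 'NR15')]
NEEDS_H = {'NR15'}  # hypothetical lookup of types that need an added H

mol = Chem.RWMol()
for element, _ in ring_atoms:
    atom = Chem.Atom(element)
    atom.SetIsAromatic(True)
    mol.AddAtom(atom)

# Close the ring with aromatic bonds, flagging each bond as aromatic too.
n = len(ring_atoms)
for i in range(n):
    j = (i + 1) % n
    mol.AddBond(i, j, Chem.BondType.AROMATIC)
    mol.GetBondBetweenAtoms(i, j).SetIsAromatic(True)

# Add an explicit H atom, single-bonded, to every atom of a flagged type.
for idx, (_, type_energy) in enumerate(ring_atoms):
    if type_energy in NEEDS_H:
        h_idx = mol.AddAtom(Chem.Atom('H'))
        mol.AddBond(idx, h_idx, Chem.BondType.SINGLE)

Chem.SanitizeMol(mol)  # includes the kekulization step
smiles_without_h_atoms = Chem.MolToSmiles(Chem.RemoveHs(mol))
print(smiles_without_h_atoms)
```

Removing the H atoms again before writing SMILES is only there to show that the N keeps its hydrogen as [nH].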

Thanks again,

Paul.

[1] http://www.ccp4.ac.uk/ccp4bin/viewcvs/ccp4/lib/data/monomers/ener_lib.cif
[2] there are other cases that I need to handle






--
Nokia and ATT present the 2010 Calling All Innovators-North America contest
Create new apps  games for the Nokia N8 for consumers in  U.S. and Canada
$10 million total in prizes - $4M cash, 500 devices, nearly $6M in marketing
Develop with Nokia Qt SDK, Web Runtime, or Java and Publish to Ovi Store 
http://p.sf.net/sfu/nokia-dev2dev
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] kekulizing carbazole

2010-11-01 Thread Greg Landrum
Hi Paul,

On Mon, Nov 1, 2010 at 3:54 PM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote:

 For the record, I would like to describe how I proceeded in the light of
 your reply.

 My starting point to construct an RWMol is an mmCIF restraints file.  As
 well as containing description of the bonds and angles (etc.) this file
 describes the atoms, part of the description of which is the
 type_energy.  Pyrrole and carbazole Ns (for example) have the type
 NR15 (energy types are listed in ener_lib.cif [1]), so now when I see
 an atom of that type [2], I add an extra H bonded to the N and
 everything is then peachy.

What you are describing sounds correct. One possible, minor,
optimization if you are building the molecule atom by atom (as opposed
to reading it from a mol block): you don't actually need to put the H
in the graph. You can instead do something like the following:
[6] m = Chem.MolFromSmiles('c1cccn1',sanitize=False)
[7] m.GetAtomWithIdx(4).SetNumExplicitHs(1)
[8] Chem.SanitizeMol(m)
[9] print(Chem.MolToSmiles(m))
c1cc[nH]c1
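
The same idea can be sketched for a molecule assembled atom by atom rather than parsed from SMILES. This is an illustration under assumptions, not code from the thread: the helper names build_aromatic_ring and sanitizes are made up here, and catching ValueError relies on the Python sanitization exceptions deriving from it.

```python
from rdkit import Chem

def build_aromatic_ring(symbols):
    """Build one ring with every atom and bond flagged aromatic (no Hs)."""
    mol = Chem.RWMol()
    for sym in symbols:
        atom = Chem.Atom(sym)
        atom.SetIsAromatic(True)
        mol.AddAtom(atom)
    n = len(symbols)
    for i in range(n):
        j = (i + 1) % n
        mol.AddBond(i, j, Chem.BondType.AROMATIC)
        mol.GetBondBetweenAtoms(i, j).SetIsAromatic(True)
    return mol

def sanitizes(mol):
    """Return True if sanitization (which includes kekulization) succeeds."""
    try:
        Chem.SanitizeMol(mol)
        return True
    except ValueError:  # assumption: sanitization errors derive from ValueError
        return False

pyrrole_atoms = ['C', 'C', 'C', 'C', 'N']

# Without any H information on the N, kekulization is expected to fail.
fails_without_h = not sanitizes(build_aromatic_ring(pyrrole_atoms))

# Recording the H as a count on the N, with no H atom in the graph.
mol = build_aromatic_ring(pyrrole_atoms)
mol.GetAtomWithIdx(4).SetNumExplicitHs(1)
Chem.SanitizeMol(mol)
smiles = Chem.MolToSmiles(mol)
print(fails_without_h, smiles)
```

Compared with adding an H atom and a bond, this keeps the graph at heavy atoms only, which is the point of the optimization.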


 Thanks again,

you're welcome!

Best Regards,
-greg



[Rdkit-discuss] kekulizing carbazole

2010-10-31 Thread Paul Emsley

Hi,

I'm running into problems when I try to kekulize carbazole.

The description I start with is that all the bonds are marked as 
Bond::AROMATIC and I do setIsAromatic(true) on all the atoms (which are 
all non-hydrogens).  The explicitValence() for the N is 3.

MolOps::Kekulize() fails in that case with the message "Can't kekulize mol".

If I add single bonds to hydrogens (including a hydrogen on the N) then
MolOps::Kekulize() works.

So my question is, how should I adjust the molecule description in the 
first case so that MolOps::Kekulize() works without hydrogens too?

Thanks,

Paul.




Re: [Rdkit-discuss] kekulizing carbazole

2010-10-31 Thread Greg Landrum
Hi Paul,

On Sun, Oct 31, 2010 at 12:09 PM, Paul Emsley
paul.ems...@bioch.ox.ac.uk wrote:


 I'm running into problems when I try to kekulize carbazole.

 The description I start with is that all the bonds are marked as
 Bond::AROMATIC and I do setIsAromatic(true) on all the atoms (which are
 all non-hydrogens).  The explicitValence() for the N is 3.

 MolOps::Kekulize() fails in that case with the message "Can't kekulize mol".

 If I add single bonds to hydrogens (including a hydrogen on the N) then
 MolOps::Kekulize() works.

 So my question is, how should I adjust the molecule description in the
 first case so that MolOps::Kekulize() works without hydrogens too?

The problem is probably the lack of an explicit H on the nitrogen
atom. It's easily demonstrated with pyrrole:

[2] m=Chem.MolFromSmiles('c1cccn1')
[15:10:05] Can't kekulize mol

You can fix this by letting the RDKit know that there's an H on the N atom:
[3] m=Chem.MolFromSmiles('c1ccc[nH]1')
[4]

Carbazole is the same story:
[4] m=Chem.MolFromSmiles('c1ccc2c(c1)[nH]c1ccccc21')
[5]

Note that in either case if you provide the structure in its Kekule
form this doesn't happen, here's the illustration for pyrrole:
[5] m=Chem.MolFromSmiles('C1=CC=CN1')
[6] Chem.MolToSmiles(m)
Out[6] 'c1cc[nH]c1'

There's an argument to be made that the Kekulization code could be
made more robust with respect to this particular edge case, but to
this point the effort involved has not seemed justified by the payoff:
most of the time the H is present in the SMILES, so this problem
doesn't occur.
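
As a script rather than an interactive session, the cases above can be checked roughly like this. It is a sketch: the parses helper is a made-up name, the carbazole string is the usual full SMILES with both benzo rings, and it assumes MolFromSmiles returns None when sanitization fails.

```python
from rdkit import Chem, RDLogger

# Silence the expected "Can't kekulize mol" error message.
RDLogger.DisableLog('rdApp.error')

def parses(smiles):
    """Return True if RDKit can parse and sanitize the SMILES."""
    return Chem.MolFromSmiles(smiles) is not None

no_h = parses('c1cccn1')          # aromatic N with no H given
with_h = parses('c1ccc[nH]1')     # H stated on the N
carbazole_ok = parses('c1ccc2c(c1)[nH]c1ccccc21')

# Kekule input needs no hint; the H reappears in the aromatic output.
roundtrip = Chem.MolToSmiles(Chem.MolFromSmiles('C1=CC=CN1'))

print(no_h, with_h, carbazole_ok, roundtrip)
```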

Best Regards,
-greg
