Re: [Rdkit-discuss] Another Can't kekulize mol observation

2017-04-27 Thread Markus Metz
Hi Greg:

Your suggestions pointed me in the right direction. Thank you very much.

I tried your example, but it seems (?) that the method GetAtomWithIdx() is
not available for edit_mol.
So I stored a neighbor atom that is an N atom, is aromatic, and does not
belong to the same ring. I then applied the SetNumExplicitHs() method on the
mol.
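
A minimal sketch of this workaround, for anyone following along. It uses 4-methyl-1,2,4-triazole as a stand-in molecule (an assumption; the molecule and indices from the notebook are not reproduced here). EditableMol offers editing calls such as RemoveAtom() and GetMol() but no atom accessors, so the explicit H is set on the molecule returned by GetMol():

```python
from rdkit import Chem

# Stand-in molecule (assumption): 4-methyl-1,2,4-triazole.
# Atom 5 is the methyl carbon, atom 4 is the aromatic N it is attached to.
mol = Chem.MolFromSmiles('c1nncn1C')
atidx, nbr_idx = 5, 4

edit_mol = Chem.EditableMol(mol)
# EditableMol has no GetAtomWithIdx(), so atoms cannot be reached through it.
has_accessor = hasattr(edit_mol, 'GetAtomWithIdx')

edit_mol.RemoveAtom(atidx)
scaffold = edit_mol.GetMol()

# Put the explicit H on the aromatic N of the returned molecule, then sanitize.
scaffold.GetAtomWithIdx(nbr_idx).SetNumExplicitHs(1)
Chem.SanitizeMol(scaffold)

scaffold_smiles = Chem.MolToSmiles(scaffold)
print(scaffold_smiles)
```

The neighbor index is lower than the removed index here, so it is still valid after RemoveAtom().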

Cheers,
Markus



Re: [Rdkit-discuss] Another Can't kekulize mol observation

2017-04-27 Thread Greg Landrum
Hi Markus,

The general rule of thumb is that if you remove an exocyclic neighbor from
an aromatic heteroatom you need to add an "explicit H" to the heteroatom.
Here's a modification of one of your pieces of code that adds that H as an
atom that's actually in the graph:

# use ReplaceAtom:
from rdkit import Chem

# mol is the input molecule from your notebook
Hatom = Chem.MolFromSmiles('[H]').GetAtomWithIdx(0)

atidx = 8

edit_mol = Chem.EditableMol(mol)

edit_mol.ReplaceAtom(atidx, Hatom)
scaffold = edit_mol.GetMol()

scaffold_smiles = Chem.MolToSmiles(scaffold)
print(scaffold_smiles)

The change relative to what you were doing is the use of MolFromSmiles() to
get the H.
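
A self-contained sketch of the ReplaceAtom route on a stand-in molecule (4-methyl-1,2,4-triazole, an assumption, since the notebook's molecule is not shown). As a choice made for this sketch, the hydrogen is built with Chem.Atom(1) rather than parsed from '[H]'. It also shows why the H then has to be removed again: it is a real atom in the graph.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('c1nncn1C')  # stand-in molecule (assumption)
atidx = 5                             # methyl carbon to be replaced

edit_mol = Chem.EditableMol(mol)
edit_mol.ReplaceAtom(atidx, Chem.Atom(1))  # swap the carbon for a hydrogen atom
scaffold = edit_mol.GetMol()
Chem.SanitizeMol(scaffold)

# The H is now an atom in the molecular graph, so the atom count is unchanged.
n_atoms_with_h = scaffold.GetNumAtoms()

# Strip the H atom again to get back to a heavy-atom-only molecule.
scaffold_no_h = Chem.RemoveHs(scaffold)
print(Chem.MolToSmiles(scaffold_no_h))
```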

A more efficient approach, which has the advantage of not leaving extra H
atoms in the molecule that then need to be removed, is to add the "explicit
H" to the atom:

atidx = 8
nbrIdx = 7

edit_mol = Chem.RWMol(mol)

edit_mol.RemoveAtom(atidx)
edit_mol.GetAtomWithIdx(nbrIdx).SetNumExplicitHs(1)
scaffold = edit_mol.GetMol()

scaffold_smiles = Chem.MolToSmiles(scaffold)
print(scaffold_smiles)

This produces:

c1ccc(cc1)-c1n[nH]c(n1)-c1c1
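
For processing many molecules, the rule of thumb above could be wrapped in a small helper along these lines. This is only a sketch: remove_exocyclic_atom is a hypothetical name, the test molecules are stand-ins, and the helper assumes atidx really is an exocyclic substituent atom (it does not check). Atoms after the removed one move down by one index, which the helper accounts for.

```python
from rdkit import Chem


def remove_exocyclic_atom(mol, atidx):
    """Remove atom atidx and give each aromatic heteroatom neighbor an explicit H.

    Hypothetical helper; assumes atidx is an exocyclic substituent atom.
    """
    # Record aromatic heteroatom neighbors before the atom disappears.
    nbr_idxs = [nbr.GetIdx()
                for nbr in mol.GetAtomWithIdx(atidx).GetNeighbors()
                if nbr.GetIsAromatic() and nbr.GetAtomicNum() != 6]

    rw_mol = Chem.RWMol(mol)
    rw_mol.RemoveAtom(atidx)

    for idx in nbr_idxs:
        # Atoms with a higher index than the removed atom shift down by one.
        new_idx = idx - 1 if idx > atidx else idx
        atom = rw_mol.GetAtomWithIdx(new_idx)
        atom.SetNumExplicitHs(atom.GetNumExplicitHs() + 1)

    Chem.SanitizeMol(rw_mol)
    return rw_mol.GetMol()


scaffold = remove_exocyclic_atom(Chem.MolFromSmiles('c1nncn1C'), 5)
print(Chem.MolToSmiles(scaffold))
```

Aromatic carbons need no special handling here, because their implicit H count is recomputed during sanitization.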


I hope that helps
-greg


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Another Can't kekulize mol observation

2017-04-27 Thread Markus Metz
Hello all:

Thank you very much for your messages.

As I would like to process many molecules, manually editing SMILES is
unfortunately not an option.

Therefore I tried to automate this step using the ReplaceAtom method of
the EditableMol class.
I defined an Hatom and tried to use it. Upon executing the attached notebook,
the input molecule is unchanged.

Do you have any other suggestions that might help answer my question?

Best,
Markus









Cannot_Kekulize.ipynb
Description: Binary data


Re: [Rdkit-discuss] Another Can't kekulize mol observation

2017-04-27 Thread Peter S. Shenkin
I would just replace 'n' with '[nH]' in your existing SMILES, for the N you
want the H on.
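
A quick illustration of this suggestion, using the parent 1,2,4-triazole ring as a stand-in for the full molecule (an assumption). Without an H on any ring nitrogen the SMILES cannot be kekulized and MolFromSmiles() returns None; with '[nH]' it parses:

```python
from rdkit import Chem

# No H specified on any ring N: RDKit cannot find a Kekule structure.
ambiguous = Chem.MolFromSmiles('c1nncn1')

# Same ring with the H placed explicitly on one N.
fixed = Chem.MolFromSmiles('c1nnc[nH]1')

print(ambiguous is None)
print(fixed.GetNumAtoms())
```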

-P.



Re: [Rdkit-discuss] Another Can't kekulize mol observation

2017-04-26 Thread Hongbin Yang
Hi Markus,

"c1ccc(cc1)-c1nnc(n1)-c1c1" is different from
"c1ccc(cc1)-c1nncn1-c1c1", so you cannot remove the parentheses.

The error "Can't kekulize mol." is caused by the triazole in your molecule.

"c1nncn1" says that the ring is aromatic, but it does not say where the H
is.

For example, "C1=NN=CN1" is 4H-1,2,4-triazole and "C1=NC=NN1" is
1H-1,2,4-triazole. They differ in their Kekulé forms, but both of them can
be represented by "c1nncn1".

There are two solutions I suggest:

1. Use `Chem.MolFromSmiles('c1ccc(cc1)-c1nnc(n1)-c1c1',False)`
   (reference:
   http://www.rdkit.org/docs/api/rdkit.Chem.rdmolfiles-module.html#MolFromSmiles)

2. Manually kekulize it:
   `Chem.MolFromSmiles('c1ccc(cc1)-C1=NN=C(N1)-c1c1')`.
   This indicates that the H is on N4.
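
A small sketch of both suggestions, again on the parent 1,2,4-triazole ring as a stand-in (an assumption). The second argument of MolFromSmiles() is the sanitize flag; passing False returns a molecule that has not been kekulized or valence-checked. Writing the Kekulé form instead fixes the H position, and the two tautomers come out as different molecules:

```python
from rdkit import Chem

# Option 1: skip sanitization (sanitize=False is the second argument).
raw = Chem.MolFromSmiles('c1nncn1', sanitize=False)
# raw is a molecule object, but no kekulization or valence check was run on it.

# Option 2: write an explicit Kekule form, which pins down where the H sits.
four_h = Chem.MolFromSmiles('C1=NN=CN1')  # 4H-1,2,4-triazole
one_h = Chem.MolFromSmiles('C1=NC=NN1')   # 1H-1,2,4-triazole

smi_4h = Chem.MolToSmiles(four_h)
smi_1h = Chem.MolToSmiles(one_h)
print(smi_4h, smi_1h)
```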



Hongbin Yang 

From: Markus Metz
Date: 2017-04-27 09:30
To: RDKit Discuss
Subject: [Rdkit-discuss] Another Can't kekulize mol observation

Hello all:

I obtained this smiles string:
c1ccc(cc1)-c1nnc(n1)-c1c1
by removing atoms from the n1 in parentheses.

Using:
mol = Chem.MolFromSmiles("c1ccc(cc1)-c1nnc(n1)-c1c1")
throws an error: Can't kekulize mol.

Using
mol = Chem.MolFromSmiles("c1ccc(cc1)-c1nncn1-c1c1")
works fine.

Is there any workaround?
Any input is highly appreciated.

Cheers,
Markus

