[Rdkit-discuss] adding custom number of explicit H to specified non-hydrogen atoms

2017-01-20 Thread Janusz Petkowski
Dear RDKit Community,

By default, H atoms are not explicit in the molecular graph, so substructure 
matching ignores them. It is possible to use Chem.AddHs(mol) to add explicit 
hydrogens to all atoms in the molecule and then perform substructure matching, 
but is it possible, in RDKit, to add explicit hydrogens only to atoms of choice 
instead of to all of them?

So let's say if I do:

from rdkit import Chem

m1 = Chem.MolFromSmiles('C=C')
m1_H = Chem.AddHs(m1)  # every implicit H becomes an atom in the graph
print(m1_H.GetNumAtoms())
print(Chem.MolToSmiles(m1_H))

The result is:

>>> 6
>>> [H]C([H])=C([H])[H]

What if I would like to add only one (1) explicit hydrogen atom to a specific 
non-hydrogen atom (let's say m1.GetAtomWithIdx(0))? In that case I would want to 
have:

print(m1_H.GetNumAtoms())
print(Chem.MolToSmiles(m1_H))

>>> 3
>>> [H]C=C

I tried m1.GetAtomWithIdx(0).SetNumExplicitHs(1), which sets one explicit H on 
the first carbon of the C=C molecule, but I cannot convert the result to a 
SMILES that shows this additional explicit H, nor use it afterwards for 
substructure matching.
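
A minimal sketch of one way to get a single H atom onto a chosen atom, 
assuming a reasonably recent RDKit release. SetNumExplicitHs only changes a 
count stored on the atom, while substructure matching needs a real H atom in 
the graph, so the sketch adds an atom and a bond through an editable RWMol. The 
helper name add_single_h is hypothetical, not an RDKit function. The last two 
lines show the onlyOnAtoms argument of Chem.AddHs for comparison; it restricts 
AddHs to the listed atoms but adds all of their hydrogens, not just one.

```python
from rdkit import Chem

def add_single_h(mol, atom_idx):
    """Hypothetical helper: bond one new H atom to atom_idx, return a new Mol."""
    rw = Chem.RWMol(mol)                        # editable copy of the molecule
    h_idx = rw.AddAtom(Chem.Atom(1))            # new hydrogen atom in the graph
    rw.AddBond(atom_idx, h_idx, Chem.BondType.SINGLE)
    Chem.SanitizeMol(rw)                        # recompute valences and implicit H counts
    return rw.GetMol()

m1 = Chem.MolFromSmiles('C=C')
m1_one_h = add_single_h(m1, 0)
print(m1_one_h.GetNumAtoms())                   # two carbons plus one hydrogen

# For comparison: all hydrogens of atom 0 only, none on atom 1
m1_all_h_on_0 = Chem.AddHs(m1, onlyOnAtoms=(0,))
print(m1_all_h_on_0.GetNumAtoms())
```

The remaining hydrogens stay implicit, so the molecule is still ethene and only 
its graph representation changes.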

In the end I would like to do substructure matching where the query structures

[H]C=C or [H]C=CC

match the following molecule:

[H]C(=C([H])C([H])([H])[H])C([H])([H])[H]

but the query structures [H]C=C([H])[H] or [H]C([H])=CC do not match it.

PS. Of course, the structure [H]C([H])=C([H])[H] converted from C=C using 
Chem.AddHs(mol) will not be matched onto 
[H]C(=C([H])C([H])([H])[H])C([H])([H])[H] which is correct.
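
The matching described above can be sketched as follows, assuming an RDKit 
release that provides Chem.SmilesParserParams. The queries are parsed with 
removeHs switched off so that the H atoms written in the SMILES stay in the 
graph, and the target is but-2-ene with all hydrogens added. The helper name 
mol_keep_hs is hypothetical.

```python
from rdkit import Chem

def mol_keep_hs(smiles):
    """Hypothetical helper: parse SMILES without stripping the written H atoms."""
    params = Chem.SmilesParserParams()
    params.removeHs = False
    return Chem.MolFromSmiles(smiles, params)

# Target: but-2-ene with every hydrogen present as an atom in the graph
target = Chem.AddHs(Chem.MolFromSmiles('CC=CC'))

results = {}
for smi in ('[H]C=C', '[H]C=CC', '[H]C=C([H])[H]', '[H]C([H])=CC'):
    query = mol_keep_hs(smi)
    results[smi] = target.HasSubstructMatch(query)
    print(smi, results[smi])
```

Each alkene carbon of the target carries exactly one H atom, so the first two 
queries are expected to match and the last two are not, because they require 
two distinct H atoms on the same alkene carbon.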

Thank you very much for your help,

Best regards,

Janusz Petkowski

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] RDKit "cannot create mol from SMILE" error

2017-01-20 Thread Larson Danes
This all explains a lot. Thanks very much, Brian and Peter!

On Wed, Jan 18, 2017 at 6:05 PM, Peter S. Shenkin wrote:

> In addition to Brian's observation, there is also a "C1" early in the
> SMILES, but no corresponding X1 to make a ring bond before or after it.
>
> It appears that you might be reading the second half of a SMILES for some
> reason. My guess is that the (C=C1) is associated with a preceding atom
> that was not read.
>
> -P
>
> On Wed, Jan 18, 2017 at 6:32 PM, Brian Kelley wrote:
>
>> That doesn't look like a valid SMILES to me. I don't think a SMILES
>> string can start with a parenthesis (branch).
>>
>> 
>> Brian Kelley
>>
>> On Jan 18, 2017, at 6:18 PM, Larson Danes wrote:
>>
>> Hi all,
>>
>> I'm using the following query in postgresql (with the rdkit extension
>> installed):
>>
>> "select casrn from mols where m @> CAST(? AS mol)"
>>
>>
>> This occasionally returns "ERROR: could not create molecule from SMILES
>> '...' ". One SMILES that regularly causes this error is
>> '(C=C1)[N+]([O-])=O'. I'm curious whether this specific error message is
>> documented anywhere. I've looked and haven't had any luck finding it.
>>
>> Any information about this error message is much appreciated.
>>
>>
>> Thanks,
>>
>>
>> Larson
>>
>
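
The two observations quoted above (a SMILES cannot start with a branch, and 
the "1" in "C1" has no partner to close the ring) can be turned into a cheap 
pre-check before a string is passed to the query. This is only a sketch in 
plain Python: the function name smiles_red_flags is hypothetical, it looks 
only at parentheses and ring-closure labels, and an empty result does not 
prove that the SMILES is valid.

```python
def smiles_red_flags(smiles):
    """Return cheap syntactic warnings; an empty list does not prove validity."""
    problems = []
    if smiles.startswith("("):
        problems.append("starts with a branch '('")
    depth = 0            # current parenthesis nesting
    open_rings = set()   # ring-closure labels seen an odd number of times
    in_bracket = False   # inside [...] digits are isotopes, H counts or charges
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if in_bracket:
            in_bracket = ch != "]"
        elif ch == "[":
            in_bracket = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("unmatched ')'")
                depth = 0
        elif ch == "%":                       # two-digit ring label, e.g. %12
            open_rings ^= {smiles[i + 1:i + 3]}
            i += 2
        elif ch.isdigit():                    # single-digit ring label
            open_rings ^= {ch}
        i += 1
    if depth > 0:
        problems.append("unclosed '('")
    if open_rings:
        problems.append("unpaired ring-closure label(s): "
                        + ", ".join(sorted(open_rings)))
    return problems

print(smiles_red_flags("(C=C1)[N+]([O-])=O"))
print(smiles_red_flags("c1ccccc1"))
```

For the string from the original question, both warnings fire: the leading 
branch and the unpaired ring-closure label 1.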