Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-15 Thread Michal Krompiec
Hi Michal,
Hmm, wouldn't this do exactly what you wanted?
sdf_in=Chem.SDMolSupplier(in_file, removeHs=False)
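
For the other half of the question (how to tell whether the input carried explicit hydrogens), here is a minimal sketch. It assumes that counting hydrogen atoms after parsing with removeHs=False is an acceptable check. The ethanol mol block and the helper count_explicit_hs are made up for illustration; MolFromMolBlock is used only so the snippet runs without a file on disk.

```python
from rdkit import Chem

# Stand-in for one SDF record: ethanol written out with explicit hydrogens.
ethanol_with_hs = Chem.AddHs(Chem.MolFromSmiles("CCO"))
mol_block = Chem.MolToMolBlock(ethanol_with_hs)


def count_explicit_hs(mol):
    """Count hydrogens that are present as real atoms in the graph."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 1)


# Default parsing: sanitize and strip the hydrogens.
stripped = Chem.MolFromMolBlock(mol_block)
# removeHs=False: still sanitized, but the hydrogens stay in the graph.
kept = Chem.MolFromMolBlock(mol_block, removeHs=False)

print(count_explicit_hs(stripped), count_explicit_hs(kept))
```

A non-zero count on the removeHs=False molecule indicates that the record defined its hydrogens explicitly.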

Best wishes,
Michal

On 15 April 2015 at 16:40, Michał Nowotka  wrote:

> Hi,
>
> I have a question, directly related to this topic: I would like to
> achieve the following behavior from RDKit:
>
> 1. If input molfile has explicit hydrogens defined, then perform only
> partial sanitization and keep hydrogens.
> 2. Otherwise perform full sanitization as usual.
>
> Will that happen automatically or do I need to check if the molfile
> has explicit hydrogens? If the latter, how can I check this using
> RDKit without writing my own SDF parser?
>
> Michał Nowotka
>
> On Tue, Apr 7, 2015 at 11:50 AM, Greg Landrum wrote:
> >
> > On Tue, Apr 7, 2015 at 12:13 PM, Michal Krompiec <michal.kromp...@gmail.com>
> > wrote:
> >>
> >> Thanks a lot!
> >
> > Glad it works.
> >
> >> By the way, it would be useful to have this feature (MergeQueryHs) also
> >> in the substructure search KNIME node.
> >
> > Indeed. :-)
> > We already have a version of the MolToRDKit node ready to allow this; it
> > just hasn't made it into the open-source version of the KNIME nodes yet
> > (we forgot to include this the last time we updated those nodes). It's
> > coming soon.
> >
> > -greg
> >
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-15 Thread Michał Nowotka
Hi,

I have a question, directly related to this topic: I would like to
achieve the following behavior from RDKit:

1. If input molfile has explicit hydrogens defined, then perform only
partial sanitization and keep hydrogens.
2. Otherwise perform full sanitization as usual.

Will that happen automatically or do I need to check if the molfile
has explicit hydrogens? If the latter, how can I check this using
RDKit without writing my own SDF parser?

Michał Nowotka

On Tue, Apr 7, 2015 at 11:50 AM, Greg Landrum  wrote:
>
> On Tue, Apr 7, 2015 at 12:13 PM, Michal Krompiec 
> wrote:
>>
>> Thanks a lot!
>
>
> Glad it works.
>
>>
>> By the way, it would be useful to have this feature (MergeQueryHs) also in
>> the substructure search KNIME node.
>
>
> Indeed. :-)
> We already have a version of the MolToRDKit node ready to allow this; it
> just hasn't made it into the open-source version of the KNIME nodes yet (we
> forgot to include this the last time we updated those nodes). It's coming
> soon.
>
> -greg
>



Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-07 Thread Greg Landrum
On Tue, Apr 7, 2015 at 12:13 PM, Michal Krompiec 
wrote:

> Thanks a lot!
>

Glad it works.


> By the way, it would be useful to have this feature (MergeQueryHs) also in
> the substructure search KNIME node.
>

Indeed. :-)
We already have a version of the MolToRDKit node ready to allow this; it
just hasn't made it into the open-source version of the KNIME nodes yet (we
forgot to include this the last time we updated those nodes). It's coming
soon.

-greg


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-07 Thread Michal Krompiec
Thanks a lot!
By the way, it would be useful to have this feature (MergeQueryHs) also in
the substructure search KNIME node.
Best wishes,
Michal
On 3 April 2015 at 06:29, Greg Landrum  wrote:

> The changes are now pushed (
> https://github.com/rdkit/rdkit/commit/f0d4cf1ec63a4928a2a28fa62cf1d255099e72d0)
> and are available on master.
> The new functions are qmol_from_smiles() and qmol_from_ctab()
>
> Best,
> -greg
>
>
> On Thu, Apr 2, 2015 at 10:51 AM, Greg Landrum 
> wrote:
>
>> Hi Michal,
>>
>> Glad to hear this matches what you are looking for. I have already added
>> the feature to the cartridge and will check it in later today/tomorrow
>> morning.
>>
>> -greg
>>
>> On Thursday, April 2, 2015, Michal Krompiec 
>> wrote:
>>
>>>  Hi Greg,
>>> Thank you, this is exactly what I needed.
>>>
>>> On 2 April 2015 at 05:22, Greg Landrum  wrote:
>>>

>>>> Skipping sanitization, as you propose, isn't going to help here: the
>>>> kekulized form of the ring will not be converted to aromatic and you won't
>>>> get the matches you are looking for.

>>> Indeed. Previously I stored my dataset as smiles with explicit
>>> hydrogens, and created the query mols by adding Hs and then deleting
>>> hydrogens at substitution sites and finally converting to SMARTS - a messy
>>> workaround, but producing the right result.
>>>
>>>
>>>> Here's an approach to this that works in Python:

>>>
>>>
>>> And this is exactly what I wanted. To illustrate it more precisely: your
>>> pattern (2-H-pyrimidine) matches pyrimidine and 5-methylpyrimidine but
>>> does not match 2-methylpyrimidine:
>>>
>>> >>> m =Chem.MolFromSmiles('c1ccnc([H])n1',sanitize=False);
>>> >>> nm=Chem.MergeQueryHs(m)
>>> >>> Chem.SanitizeMol(nm)
>>> rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>>> >>> Chem.MolFromSmiles('c1ccncn1').HasSubstructMatch(nm)
>>> True
>>> >>> Chem.MolFromSmiles('c1c(C)cncn1').HasSubstructMatch(nm)
>>> True
>>> >>> Chem.MolFromSmiles('c1ccnc(C)n1').HasSubstructMatch(nm)
>>> False
>>> >>>
>>>
>>>
>>>

>>>> Being able to do something equivalent in the cartridge would
>>>> certainly be useful. What I'd suggest is the addition of two functions:
>>>> "query_mol_from_smiles()" and "query_mol_from_ctab()" that do this.

>>>
>>> I'll do it.
>>>
>>>
>>>> Then you could do queries like:
>>>> select * from mols where m @> query_mol_from_smiles('c1ccnc([H])n1');
>>>> and have it do the right thing.
>>>>
>>>> Sound reasonable?
>>>>
>>>> -greg


>>> Best wishes,
>>> Michal
>>>
>>>
>


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-02 Thread Greg Landrum
The changes are now pushed (
https://github.com/rdkit/rdkit/commit/f0d4cf1ec63a4928a2a28fa62cf1d255099e72d0)
and are available on master.
The new functions are qmol_from_smiles() and qmol_from_ctab()

Best,
-greg


On Thu, Apr 2, 2015 at 10:51 AM, Greg Landrum 
wrote:

> Hi Michal,
>
> Glad to hear this matches what you are looking for. I have already added
> the feature to the cartridge and will check it in later today/tomorrow
> morning.
>
> -greg
>
> On Thursday, April 2, 2015, Michal Krompiec 
> wrote:
>
>> Hi Greg,
>> Thank you, this is exactly what I needed.
>>
>> On 2 April 2015 at 05:22, Greg Landrum  wrote:
>>
>>>
>>> Skipping sanitization, as you propose, isn't going to help here: the
>>> kekulized form of the ring will not be converted to aromatic and you won't
>>> get the matches you are looking for.
>>>
>> Indeed. Previously I stored my dataset as smiles with explicit hydrogens,
>> and created the query mols by adding Hs and then deleting hydrogens at
>> substitution sites and finally converting to SMARTS - a messy workaround,
>> but producing the right result.
>>
>>
>>> Here's an approach to this that works in Python :
>>>
>>
>>
>> And this is exactly what I wanted. To illustrate it more precisely: your
>> pattern (2-H-pyrimidine) matches pyrimidine and 5-methylpyrimidine but
>> does not match 2-methylpyrimidine:
>>
>> >>> m =Chem.MolFromSmiles('c1ccnc([H])n1',sanitize=False);
>> >>> nm=Chem.MergeQueryHs(m)
>> >>> Chem.SanitizeMol(nm)
>> rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>> >>> Chem.MolFromSmiles('c1ccncn1').HasSubstructMatch(nm)
>> True
>> >>> Chem.MolFromSmiles('c1c(C)cncn1').HasSubstructMatch(nm)
>> True
>> >>> Chem.MolFromSmiles('c1ccnc(C)n1').HasSubstructMatch(nm)
>> False
>> >>>
>>
>>
>>
>>>
>>> Being able to do something equivalent in the cartridge would certainly
>>> be useful. What I'd suggest is the addition of two functions:
>>> "query_mol_from_smiles()" and "query_mol_from_ctab()" that do this.
>>>
>>
>> I'll do it.
>>
>>
>>> Then you could do queries like:
>>> select * from mols where m @> query_mol_from_smiles('c1ccnc([H])n1');
>>> and have it do the right thing.
>>>
>>> Sound reasonable?
>>>
>>> -greg
>>>
>>>
>> Best wishes,
>> Michal
>>
>>


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-02 Thread Greg Landrum
Hi Michal,

Glad to hear this matches what you are looking for. I have already added
the feature to the cartridge and will check it in later today/tomorrow
morning.

-greg

On Thursday, April 2, 2015, Michal Krompiec 
wrote:

> Hi Greg,
> Thank you, this is exactly what I needed.
>
> On 2 April 2015 at 05:22, Greg Landrum wrote:
>
>>
>> Skipping sanitization, as you propose, isn't going to help here: the
>> kekulized form of the ring will not be converted to aromatic and you won't
>> get the matches you are looking for.
>>
> Indeed. Previously I stored my dataset as smiles with explicit hydrogens,
> and created the query mols by adding Hs and then deleting hydrogens at
> substitution sites and finally converting to SMARTS - a messy workaround,
> but producing the right result.
>
>
>> Here's an approach to this that works in Python :
>>
>
>
> And this is exactly what I wanted. To illustrate it more precisely: your
> pattern (2-H-pyrimidine) matches pyrimidine and 5-methylpyrimidine but
> does not match 2-methylpyrimidine:
>
> >>> m =Chem.MolFromSmiles('c1ccnc([H])n1',sanitize=False);
> >>> nm=Chem.MergeQueryHs(m)
> >>> Chem.SanitizeMol(nm)
> rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
> >>> Chem.MolFromSmiles('c1ccncn1').HasSubstructMatch(nm)
> True
> >>> Chem.MolFromSmiles('c1c(C)cncn1').HasSubstructMatch(nm)
> True
> >>> Chem.MolFromSmiles('c1ccnc(C)n1').HasSubstructMatch(nm)
> False
> >>>
>
>
>
>>
>> Being able to do something equivalent in the cartridge would certainly be
>> useful. What I'd suggest is the addition of two functions:
>> "query_mol_from_smiles()" and "query_mol_from_ctab()" that do this.
>>
>
> I'll do it.
>
>
>> Then you could do queries like:
>> select * from mols where m @> query_mol_from_smiles('c1ccnc([H])n1');
>> and have it do the right thing.
>>
>> Sound reasonable?
>>
>> -greg
>>
>>
> Best wishes,
> Michal
>
>


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-02 Thread Michal Krompiec
Hi Greg,
Thank you, this is exactly what I needed.

On 2 April 2015 at 05:22, Greg Landrum  wrote:

>
> Skipping sanitization, as you propose, isn't going to help here: the
> kekulized form of the ring will not be converted to aromatic and you won't
> get the matches you are looking for.
>
Indeed. Previously I stored my dataset as smiles with explicit hydrogens,
and created the query mols by adding Hs and then deleting hydrogens at
substitution sites and finally converting to SMARTS - a messy workaround,
but producing the right result.


> Here's an approach to this that works in Python :
>


And this is exactly what I wanted. To illustrate it more precisely: your
pattern (2-H-pyrimidine) matches pyrimidine and 5-methylpyrimidine but
does not match 2-methylpyrimidine:

>>> m =Chem.MolFromSmiles('c1ccnc([H])n1',sanitize=False);
>>> nm=Chem.MergeQueryHs(m)
>>> Chem.SanitizeMol(nm)
rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>>> Chem.MolFromSmiles('c1ccncn1').HasSubstructMatch(nm)
True
>>> Chem.MolFromSmiles('c1c(C)cncn1').HasSubstructMatch(nm)
True
>>> Chem.MolFromSmiles('c1ccnc(C)n1').HasSubstructMatch(nm)
False
>>>



>
> Being able to do something equivalent in the cartridge would certainly be
> useful. What I'd suggest is the addition of two functions:
> "query_mol_from_smiles()" and "query_mol_from_ctab()" that do this.
>

I'll do it.


> Then you could do queries like:
> select * from mols where m @> query_mol_from_smiles('c1ccnc([H])n1');
> and have it do the right thing.
>
> Sound reasonable?
>
> -greg
>
>
Best wishes,
Michal


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-01 Thread Greg Landrum
Hi Michal,

On Wed, Apr 1, 2015 at 10:51 AM, Michal Krompiec 
wrote:

> Hi Greg,
> Is it possible to do the same (i.e. create a molecule from SMILES without
> removing explicit hydrogens) in the postgresql cartridge? I would like to
> do a "restricted" substructure search using SMILES queries.
>

I think I understand your use case.


> For example, with the standard behaviour (hydrogens removed),
> c1ccccc1[CH3] is converted to c1ccccc1C and matches TNT and benzaldehyde,
> whereas if the hydrogens are not removed, this SMILES query would match TNT
> but not benzaldehyde. Of course, this can be done with SMARTS, but SMILES
> with explicit hydrogens can be drawn in MarvinSketch in KNIME by a
> non-expert user.
>

Without getting overly into terminology, it sounds to me like you want
people to be able to draw something corresponding to
"C1=CC=CC=C1C([H])([H])[H]" in a sketcher, convert that to SMILES, and have
the query constructed from that SMILES match toluene but not ethyl-benzene
or benzaldehyde. Going via SMARTS here does not work because
[#6]-1=[#6]-[#6]=[#6]-[#6]=[#6]-1[H]C([H])[H] doesn't match much of
anything.

Skipping sanitization, as you propose, isn't going to help here: the
kekulized form of the ring will not be converted to aromatic and you won't
get the matches you are looking for.

Here's an approach to this that works in Python:

In [8]: m =Chem.MolFromSmiles('c1ccnc([H])n1',sanitize=False)

In [9]: nm=Chem.MergeQueryHs(m)

In [10]: Chem.SanitizeMol(nm)
Out[10]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [11]: Chem.MolFromSmiles('c1ccncn1').HasSubstructMatch(nm)
Out[11]: True

In [12]: Chem.MolFromSmiles('c1ccnc(C)n1').HasSubstructMatch(nm)
Out[12]: False


Notice the MergeQueryHs() step; that's essential unless you are storing
molecules in the database with Hs attached (pretty unlikely).
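
The three steps above can be wrapped into one helper. This is only a sketch: the Python function name query_mol_from_smiles is borrowed from the cartridge proposal for illustration and is not an RDKit API, and the test molecules are the ones used elsewhere in this thread.

```python
from rdkit import Chem


def query_mol_from_smiles(smiles):
    """Hypothetical helper: build a query in which each [H] is a constraint."""
    mol = Chem.MolFromSmiles(smiles, sanitize=False)  # keeps the [H] atoms
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    query = Chem.MergeQueryHs(mol)  # fold each [H] into a query on its neighbour
    Chem.SanitizeMol(query)         # in-place ring and aromaticity perception
    return query


query = query_mol_from_smiles("c1ccnc([H])n1")
targets = {
    "pyrimidine": "c1ccncn1",
    "5-methylpyrimidine": "c1c(C)cncn1",
    "2-methylpyrimidine": "c1ccnc(C)n1",
}
results = {
    name: Chem.MolFromSmiles(smi).HasSubstructMatch(query)
    for name, smi in targets.items()
}
print(results)
```

Only the 2-substituted target should be rejected, since the merged hydrogen requires a hydrogen at that position.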

Being able to do something equivalent in the cartridge would certainly be
useful. What I'd suggest is the addition of two functions:
"query_mol_from_smiles()" and "query_mol_from_ctab()" that do this.

Then you could do queries like:
select * from mols where m @> query_mol_from_smiles('c1ccnc([H])n1');
and have it do the right thing.

Sound reasonable?

-greg


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-01 Thread Michal Krompiec
Hi Greg,
Is it possible to do the same (i.e. create a molecule from SMILES without
removing explicit hydrogens) in the postgresql cartridge? I would like to
do a "restricted" substructure search using SMILES queries.

For example, with the standard behaviour (hydrogens removed), c1ccccc1[CH3]
is converted to c1ccccc1C and matches TNT and benzaldehyde, whereas if the
hydrogens are not removed, this SMILES query would match TNT but not
benzaldehyde. Of course, this can be done with SMARTS, but SMILES with
explicit hydrogens can be drawn in MarvinSketch in KNIME by a non-expert
user.
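
The difference between the two behaviours can be reproduced in Python. This is a sketch under assumptions: the TNT and benzaldehyde SMILES are written by hand for illustration, the query is taken to be toluene, and the restricted query is built with the MergeQueryHs() route discussed in this thread rather than with anything in the cartridge.

```python
from rdkit import Chem

tnt = Chem.MolFromSmiles("Cc1c(cc(cc1[N+](=O)[O-])[N+](=O)[O-])[N+](=O)[O-]")
benzaldehyde = Chem.MolFromSmiles("O=Cc1ccccc1")

# Standard behaviour: the methyl hydrogens place no constraint on the match.
loose = Chem.MolFromSmiles("c1ccccc1[CH3]")

# Restricted behaviour: keep the hydrogens as atoms, then merge them into
# hydrogen-count queries on the methyl carbon.
raw = Chem.MolFromSmiles("c1ccccc1C([H])([H])[H]", sanitize=False)
restricted = Chem.MergeQueryHs(raw)
Chem.SanitizeMol(restricted)

loose_hits = (tnt.HasSubstructMatch(loose),
              benzaldehyde.HasSubstructMatch(loose))
restricted_hits = (tnt.HasSubstructMatch(restricted),
                   benzaldehyde.HasSubstructMatch(restricted))
print(loose_hits, restricted_hits)
```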

It seems that this is not possible currently in the cartridge and would
require:
- exposing sanitize parameter in parseMolText in adapter.cpp
- adding a modified mol_to_smiles to rdkit_io.c and rdkit.sql(91).in
Am I right or is there a simpler way of doing it?

Best wishes,
Michal


On 25 February 2014 at 04:23, Greg Landrum  wrote:

> Hi Michal,
>
> On Mon, Feb 24, 2014 at 4:48 PM, Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hello, I have just noticed this:
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]"))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False))
>> '[H]c1sc([H])c([H])c1[H]'
>> >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]"))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]",sanitize=False))
>> '[H]c1cscc1[H]'
>>
>> Is it the expected behaviour? Why does sanitization remove hydrogens?
>> Is it controlled by any of the SanitizeFlags?
>>
>
>  It is the expected behavior. When sanitization is turned on, the SMILES
> parser actually calls "RemoveHs"; this removes the hydrogens from the graph
> and then sanitizes the molecule.
>
> If you do not want the Hs removed, you can tell MolFromSmiles to skip the
> sanitization (which also skips the RemoveHs) and then sanitize yourself::
>
> In [3]: m=Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)
>
> In [4]: Chem.SanitizeMol(m)
> Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [5]: print(Chem.MolToSmiles(m))
> [H]c1sc([H])c([H])c1[H]
>
> I hope this helps,
> -greg
>
>


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2014-02-25 Thread Michal Krompiec
Thanks Greg, this is exactly what I wanted to know. Would you consider
adding an optional removeHs argument to MolFromSmiles(), as in
mol/mol2/sdf parsers?
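
For readers on a newer RDKit: as far as I know, releases later than this thread expose this switch through a parser-parameters object rather than a keyword argument. A minimal sketch, assuming Chem.SmilesParserParams and its removeHs attribute are available in the installed version:

```python
from rdkit import Chem

params = Chem.SmilesParserParams()
params.removeHs = False  # keep [H] atoms; sanitization stays on by default

mol = Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]", params)
num_hs = sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 1)
print(mol.GetNumAtoms(), num_hs)
```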
Best wishes,
Michal

On 25 February 2014 04:23, Greg Landrum  wrote:
> Hi Michal,
>
> On Mon, Feb 24, 2014 at 4:48 PM, Michal Krompiec 
> wrote:
>>
>> Hello, I have just noticed this:
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]"))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False))
>> '[H]c1sc([H])c([H])c1[H]'
>> >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]"))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]",sanitize=False))
>> '[H]c1cscc1[H]'
>>
>> Is it the expected behaviour? Why does sanitization remove hydrogens?
>>
>> Is it controlled by any of the SanitizeFlags?
>
>
>  It is the expected behavior. When sanitization is turned on, the SMILES
> parser actually calls "RemoveHs"; this removes the hydrogens from the graph
> and then sanitizes the molecule.
>
> If you do not want the Hs removed, you can tell MolFromSmiles to skip the
> sanitization (which also skips the RemoveHs) and then sanitize yourself::
>
> In [3]: m=Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)
>
> In [4]: Chem.SanitizeMol(m)
> Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [5]: print(Chem.MolToSmiles(m))
> [H]c1sc([H])c([H])c1[H]
>
> I hope this helps,
> -greg
>



Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2014-02-24 Thread Greg Landrum
Hi Michal,

On Mon, Feb 24, 2014 at 4:48 PM, Michal Krompiec
wrote:

> Hello, I have just noticed this:
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]"))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False))
> '[H]c1sc([H])c([H])c1[H]'
> >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]"))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]",sanitize=False))
> '[H]c1cscc1[H]'
>
> Is it the expected behaviour? Why does sanitization remove hydrogens?
> Is it controlled by any of the SanitizeFlags?
>

 It is the expected behavior. When sanitization is turned on, the SMILES
parser actually calls "RemoveHs"; this removes the hydrogens from the graph
and then sanitizes the molecule.

If you do not want the Hs removed, you can tell MolFromSmiles to skip the
sanitization (which also skips the RemoveHs) and then sanitize yourself::

In [3]: m=Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)

In [4]: Chem.SanitizeMol(m)
Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [5]: print(Chem.MolToSmiles(m))
[H]c1sc([H])c([H])c1[H]
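
On the SanitizeFlags part of the question, a small sketch that separates the two operations. It assumes nothing beyond the calls already used in this thread plus the sanitizeOps keyword of SanitizeMol; the variable names are for illustration only.

```python
from rdkit import Chem

smiles = "[H]c1c([H])sc([H])c1[H]"
mol = Chem.MolFromSmiles(smiles, sanitize=False)  # Hs are kept as atoms

# Run every sanitization step explicitly; none of them deletes atoms.
status = Chem.SanitizeMol(mol, sanitizeOps=Chem.SanitizeFlags.SANITIZE_ALL)
atoms_after_sanitize = mol.GetNumAtoms()

# Hydrogen removal is a separate call and returns a new molecule.
stripped = Chem.RemoveHs(mol)
atoms_after_remove = stripped.GetNumAtoms()

print(status, atoms_after_sanitize, atoms_after_remove)
```

The atom count only drops after RemoveHs(), which is consistent with hydrogen removal being a parser step rather than something selected by a sanitize flag.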

I hope this helps,
-greg


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2014-02-24 Thread Christos Kannas
Hi Michal,

It doesn't actually remove them; to be more precise, it hides them (they
become implicit). You can define a hydrogen explicitly as ([H]), but if you
omit it, it still exists.
You can use Chem.AddHs(...) to add the hydrogens to a molecule and
Chem.RemoveHs(...) to hide them.
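
A short sketch of that round trip, using thiophene as in the original question. The variable names are for illustration only.

```python
from rdkit import Chem

thiophene = Chem.MolFromSmiles("c1ccsc1")

# Hidden hydrogens are stored as per-atom counts, not as atoms in the graph.
implicit_counts = [atom.GetTotalNumHs() for atom in thiophene.GetAtoms()]

with_hs = Chem.AddHs(thiophene)     # hydrogens become real atoms
without_hs = Chem.RemoveHs(with_hs)  # and are folded back into counts

print(implicit_counts)
print(thiophene.GetNumAtoms(), with_hs.GetNumAtoms(), without_hs.GetNumAtoms())
```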

Best,
Christos


On 24 February 2014 15:48, Michal Krompiec wrote:

> Hello, I have just noticed this:
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]"))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False))
> '[H]c1sc([H])c([H])c1[H]'
> >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]"))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]",sanitize=False))
> '[H]c1cscc1[H]'
>
> Is it the expected behaviour? Why does sanitization remove hydrogens?
> Is it controlled by any of the SanitizeFlags?
>
> Best wishes,
> Michal
>



-- 

Christos Kannas
Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608



[Rdkit-discuss] sanitization removes Hs - is this expected?

2014-02-24 Thread Michal Krompiec
Hello, I have just noticed this:
>>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]"))
'c1ccsc1'
>>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False))
'[H]c1sc([H])c([H])c1[H]'
>>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)))
'c1ccsc1'
>>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]"))
'c1ccsc1'
>>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]",sanitize=False))
'[H]c1cscc1[H]'

Is it the expected behaviour? Why does sanitization remove hydrogens?
Is it controlled by any of the SanitizeFlags?

Best wishes,
Michal
