[Rdkit-discuss] molecule not draw well

2015-02-02 Thread JP
Just an FYI

The following molecule: Cc1ccc(C[NH+]2C32CC(NC(=S)Nc2c2C)C3)cc1
looks broken when drawn with 2014.09.1 (attached).

Thanks,

-
Jean-Paul Ebejer
Early Stage Researcher
--
Dive into the World of Parallel Programming. The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net/
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Modified Mol objects with concurrent.futures

2015-02-02 Thread Christos Kannas
Hi Michael,

The problem occurs because child processes return their results using
pickle, and an ordinary RDKit molecule object loses information when it is
pickled.
A solution that I use is to convert the molecule objects to PropertyMol
objects, which retain their properties.
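
As a minimal sketch of that approach (not code from this thread): it assumes the `PropertyMol` class from `rdkit.Chem.PropertyMol` and uses a plain pickle round trip to stand in for what happens when a worker process hands a result back to the parent.

```python
import pickle

from rdkit import Chem
from rdkit.Chem.PropertyMol import PropertyMol

mol = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
mol.SetProp("Name", "Alanine")

# Wrap the molecule so that its properties are included when it is pickled.
pmol = PropertyMol(mol)

# ProcessPoolExecutor pickles arguments and return values; this round trip
# imitates that step without starting a worker process.
restored = pickle.loads(pickle.dumps(pmol))

print(restored.GetProp("Name"))
```

In a parallel script this would presumably mean submitting `PropertyMol(mol)` to the pool and returning `PropertyMol(mol)` from the worker function, so that properties survive in both directions.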

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608






[Rdkit-discuss] Modified Mol objects with concurrent.futures

2015-02-02 Thread Reutlinger, Michael
Hi all,

I am currently trying to parallelize part of a script using RDKit and
concurrent.futures. The function that is executed in parallel returns
processed molecules as RDKit Mol objects.

Without parallelization everything is fine and the Mol objects keep all the
properties that they had before the processing. When using
concurrent.futures, the returned molecules lose all properties and seem to
be created from scratch, possibly with unknown side effects.

I am wondering if anyone has experienced the same issue and knows how to
circumvent it. I attached an IPython notebook with a small script
demonstrating the issue.

Best,
Michael




Example Code:

from concurrent import futures
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole  # notebook rendering only

def process(mol):
    # Runs in the worker process; the molecule arrives via pickle.
    if "Name" not in mol.GetPropNames():
        print("Processing: Name missing")
    mol.SetProp("Processed", "True")
    return mol

if __name__ == "__main__":
    mol = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
    mol.SetProp("Name", "Alanine")

    with futures.ProcessPoolExecutor(max_workers=1) as pool:
        future = pool.submit(process, mol)
        # The result is pickled again on its way back to the parent.
        molOut = future.result()
        if "Name" not in molOut.GetPropNames():
            print("Result: Name missing")
        if "Processed" not in molOut.GetPropNames():
            print("Result: Processed missing")


RDKIT_ParallelProblem (1).ipynb
Description: Binary data


Re: [Rdkit-discuss] Modified Mol objects with concurrent.futures

2015-02-02 Thread Reutlinger, Michael
Hi Christos,

Thanks for pointing out the pickle issue and the solution using
PropertyMol. Having read the documentation, I think this should definitely
solve the problem.

Best,
Michael

Michael Reutlinger, PhD
Scientist, Molecular Design and Chemical Biology
Roche Pharma Research and Early Development

Roche Innovation Center Basel

F. Hoffmann-La Roche Ltd
Grenzacherstrasse 124
4070 Basel
Switzerland

Phone +41 61 688 87 95
Fax +41 61 688 64 59








Re: [Rdkit-discuss] Replacing H's with F's

2015-02-02 Thread Matthew Lardy
Thanks Peter and Greg!  I had a three-atom query to restrict where I was
putting F's; otherwise I would have done as Peter suggested.  Granted,
my path to flush out the duplicates by pushing this out into Java (using
the RDKit SWIG bindings) was way more involved than this!  Thanks for the
walkthrough, Greg!  It was very helpful!

Thanks again!
Matthew



On Sat, Jan 31, 2015 at 1:58 AM, Greg Landrum greg.land...@gmail.com
wrote:

 For anyone interested in this topic, I just did an RDKit blog post that
 has a somewhat expanded version of this answer:
 http://rdkit.blogspot.com/2015/01/chemical-reaction-notes-i.html

 Best,
 -greg

 On Sat, Jan 31, 2015 at 7:59 AM, Greg Landrum greg.land...@gmail.com
 wrote:

 Hi Matthew,

 On Fri, Jan 30, 2015 at 11:06 PM, Matthew Lardy mla...@gmail.com wrote:


 I am having an issue using the Smarts based Reaction transformations in
 RDKit.  This is a weird transformation, but I wanted to replace any or all
 of the protons on an aromatic ring with an F.

 The original transformation that I tried was:
 c(F)c

 But that didn't work.  So then I tried a couple of other transformations:

 [c:1][c:2][c:3]>>[c:1][c:2]([F])[c:3]

 That failed (as these things generally were failing):
 >>> ps = rxn.RunReactants(mol1)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 Boost.Python.ArgumentError: Python argument types in
     ChemicalReaction.RunReactants(ChemicalReaction, Mol)
 did not match C++ signature:
     RunReactants(class RDKit::ChemicalReaction *, class boost::python::list)
     RunReactants(class RDKit::ChemicalReaction *, class boost::python::tuple)


 The hint to what is going on is in the error message: you called the
 RunReactants method with a Mol (the ChemicalReaction in the argument list
 is the self argument) and it was expecting either a list or a tuple.
 Here's a version that works:

 In [8]: rxn = AllChem.ReactionFromSmarts('[c:1][c:2][c:3]>>[c:1][c:2]([F])[c:3]')
 In [9]: m = Chem.MolFromSmiles('c1ccccc1')
 In [10]: ps = rxn.RunReactants((m,))
 In [11]: len(ps)
 Out[11]: 12
 In [12]: Chem.MolToSmiles(ps[0][0])
 Out[12]: 'Fc1ccccc1'

 Note that this still doesn't really do what you want, because the reaction
 as encoded adds an F to any aromatic carbon, whether or not it carries an
 H. Here's an example that shows that:

 In [15]: m = Chem.MolFromSmiles('c1ccc(C)cc1')
 In [16]: ps = rxn.RunReactants((m,))
 In [17]: len(ps)
 Out[17]: 12
 In [18]: set([Chem.MolToSmiles(x[0],True) for x in ps])
 Out[18]: {'Cc1(F)ccccc1', 'Cc1ccc(F)cc1', 'Cc1cccc(F)c1', 'Cc1ccccc1F'}

 Note the first product: the F was also added to the carbon with the
 methyl group.

 We can fix that by specifying that the reacting carbon must have an H
 attached:

 In [22]: rxn = AllChem.ReactionFromSmarts('[c:1][cH:2][c:3]>>[c:1][c:2]([F])[c:3]')
 In [23]: ps = rxn.RunReactants((m,))
 In [24]: len(ps)
 Out[24]: 10
 In [25]: set([Chem.MolToSmiles(x[0],True) for x in ps])
 Out[25]: {'Cc1ccc(F)cc1', 'Cc1cccc(F)c1', 'Cc1ccccc1F'}

 There's still the question of why so many products are being produced.
 Look at Out[24]: why do we get 10 products when only three are unique?

 The answer is the symmetry in the query describing the reactant.
 Everywhere this query can match, it matches twice - frontwards and
 backwards. So instead of five products, three of which are unique, we get
 ten.
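
 The double matching can be seen directly with a substructure search. This is a small sketch, not part of the original answer; it reuses the reactant query from above and assumes the `uniquify` argument of `GetSubstructMatches`. For toluene the two counts should come out as ten and five, in line with the numbers above.

```python
from rdkit import Chem

m = Chem.MolFromSmiles("c1ccc(C)cc1")  # toluene, as in the example above
query = Chem.MolFromSmarts("[c:1][cH:2][c:3]")

# uniquify=False keeps both orientations of each match (c:1 and c:3 swapped).
all_matches = m.GetSubstructMatches(query, uniquify=False)

# The default (uniquify=True) keeps one match per set of atoms.
unique_matches = m.GetSubstructMatches(query)

print(len(all_matches), len(unique_matches))
```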

 This can be handled by recognizing that [c:1] and [c:3] are not actually
 involved in the reaction, they are just there to define the environment of
 [c:2]. We can do the same thing with a recursive SMARTS:

 In [30]: rxn = AllChem.ReactionFromSmarts('[cH$(c(c)c):2]>>[c:2][F]')
 In [31]: ps = rxn.RunReactants((m,))
 In [32]: len(ps)
 Out[32]: 5
 In [33]: set([Chem.MolToSmiles(x[0],True) for x in ps])
 Out[33]: {'Cc1ccc(F)cc1', 'Cc1cccc(F)c1', 'Cc1ccccc1F'}
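
 For reuse in a script, the recursive-SMARTS reaction can be wrapped in a small helper that returns only the unique products. This is a sketch under assumptions: the helper name `fluorinate_once` is made up for illustration, and sanitizing each product before writing SMILES is an extra precaution that the session above does not take.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Single-atom reaction; the recursive SMARTS describes the ring environment.
RXN = AllChem.ReactionFromSmarts("[cH$(c(c)c):2]>>[c:2][F]")


def fluorinate_once(mol):
    """Return unique SMILES with one aromatic C-H replaced by C-F.

    Hypothetical helper, not part of RDKit.
    """
    smiles = set()
    for products in RXN.RunReactants((mol,)):
        product = products[0]
        Chem.SanitizeMol(product)  # RunReactants returns unsanitized products
        smiles.add(Chem.MolToSmiles(product))
    return smiles


toluene = Chem.MolFromSmiles("c1ccc(C)cc1")
unique_products = fluorinate_once(toluene)
print(sorted(unique_products))
```

 Collecting canonical SMILES in a set removes the remaining duplicates that come from the symmetry of the molecule itself (the two ortho and the two meta positions of toluene).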

 Hope this helps,
 -greg




 Then I got desperate:

 [#6:1][#6:2]([#1])[#6:3].[H][#9:4]>>[#6:1][#6:2]([#9:4])[#6:3]

 Any mention of an explicit H caused issues, so then I dropped it and
 re-ran things again.

 No luck.  I should mention that I am using the pre-built python RDKit
 wrappers for windows, and if I use the java wrappers on linux I get
 different errors but the same outcome.

 I should add, that the molecule that I read (and the molecule for HF)
 were both loaded without issue.

 Anyone else try to do something like this?

 Matthew

