Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-17 Thread Guillaume GODIN
+1 for Symmetrizer too. A must!


From: Michal Krompiec
Date: Thursday, 18 January 2018 at 08:18
To: Jason Biggs
Cc: RDKit Discuss, Greg Landrum
Subject: Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

+1 vote for Symmetrizer. It would be very useful for preparing input for 
computational chemistry codes.
Best,
Michal Krompiec
Merck KGaA

On Mon, 15 Jan 2018 at 15:21, Jason Biggs wrote:

  *   I've had this on my to-do list for a few months now, implementing the 
algorithm described in this paper.  I think the force-field energy minimization 
routines already present in the RDKit can be utilized for this pretty easily.  
The only part that I don't think is set up already would be applying a constant 
force to all atoms to force them into the xy plane.

Frączek, T., "Simulation-Based Algorithm for Two-Dimensional Chemical Structure 
Diagram Generation of Complex Molecules and Ligand–Protein Interactions." J. 
Chem. Inf. Model. 2016, 56, 2320-2335, DOI: 10.1021/acs.jcim.6b00391.


  *   Another idea would be to add in point-group symmetry detection.  I'm 
using the Symmetrizer java library, described here 
https://www.ncbi.nlm.nih.gov/pubmed/22549414, and pretty happy with it overall. 
 One could re-implement it in C++, or include the jar in the External folder 
and write python wrappers.

Jason Biggs


On Mon, Jan 15, 2018 at 1:09 AM, Greg Landrum wrote:
Dear all,

We've been invited again to participate in the OpenChemistry application for 
Google Summer of Code.

In order to participate we need ideas for projects and mentors to go along with 
them.

The current list of RDKit ideas is being maintained here:
http://wiki.openchemistry.org/GSoC_Ideas_2018#RDKit_Project_Ideas

(Note: at the point that I'm pressing "send", that's still a copy of last 
year's project ideas).

If you're willing to be a mentor (please ask me about the ~5 hours/week 
required here) or have ideas, please reply to this thread.

Best,
-greg

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



***
DISCLAIMER  
This email and any files transmitted with it, including replies and forwarded 
copies (which may contain alterations) subsequently transmitted from Firmenich, 
are confidential and solely for the use of the intended recipient. The contents 
do not represent the opinion of Firmenich except to the extent that it relates 
to their official business.  
***


Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-17 Thread Michal Krompiec
+1 vote for Symmetrizer. It would be very useful for preparing input for
computational chemistry codes.
Best,
Michal Krompiec
Merck KGaA

On Mon, 15 Jan 2018 at 15:21, Jason Biggs  wrote:

>
>- I've had this on my to-do list for a few months now, implementing
>the algorithm described in this paper.  I think the force-field energy
>minimization routines already present in the RDKit can be utilized for this
>pretty easily.  The only part that I don't think is set up already would be
>applying a constant force to all atoms to force them into the xy plane.
>
> Frączek, T., "Simulation-Based Algorithm for Two-Dimensional Chemical
> Structure Diagram Generation of Complex Molecules and Ligand–Protein
> Interactions." J. Chem. Inf. Model. 2016, 56, 2320-2335, DOI:
> 10.1021/acs.jcim.6b00391.
>
>
>
>- Another idea would be to add in point-group symmetry detection.  I'm
>using the Symmetrizer java library, described here
>https://www.ncbi.nlm.nih.gov/pubmed/22549414, and pretty happy with it
>overall.  One could re-implement it in C++, or include the jar in the
>External folder and write python wrappers.
>
>
> Jason Biggs
>
>
> On Mon, Jan 15, 2018 at 1:09 AM, Greg Landrum 
> wrote:
>
>> Dear all,
>>
>> We've been invited again to participate in the OpenChemistry application
>> for Google Summer of Code.
>>
>> In order to participate we need ideas for projects and mentors to go
>> along with them.
>>
>> The current list of RDKit ideas is being maintained here:
>> http://wiki.openchemistry.org/GSoC_Ideas_2018#RDKit_Project_Ideas
>>
>> (Note: at the point that I'm pressing "send", that's still a copy of last
>> year's project ideas).
>>
>> If you're willing to be a mentor (please ask me about the ~5 hours/week
>> required here) or have ideas, please reply to this thread.
>>
>> Best,
>> -greg
>>
>>
>>
>>
>
>


Re: [Rdkit-discuss] Exhaustive Library Enumeration

2018-01-17 Thread Christos Kannas
Hi Andy,

A better option is to sanitize the products of a reaction enumeration
before using them as reactants.
Look at this example from RDKit "Getting Started" documentation.

Note that the molecules that are produced by the chemical reaction
processing code are not sanitized, as this artificial reaction demonstrates:

>>> rxn = AllChem.ReactionFromSmarts('[C:1]=[C:2][C:3]=[C:4].[C:5]=[C:6]>>[C:1]1=[C:2][C:3]=[C:4][C:5]=[C:6]1')
>>> ps = rxn.RunReactants((Chem.MolFromSmiles('C=CC=C'), Chem.MolFromSmiles('C=C')))
>>> Chem.MolToSmiles(ps[0][0])
'C1=CC=CC=C1'
>>> p0 = ps[0][0]
>>> Chem.SanitizeMol(p0)
rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>>> Chem.MolToSmiles(p0)
'c1ccccc1'
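
As a hedged sketch of that approach (not an existing RDKit call, as far as I know): sanitize each product inside a try/except, drop the ones that fail, and feed the survivors back in until nothing new is formed. The helper names alkylate_once and exhaust, the max_rounds cap, and benzyl chloride written as 'ClCc1ccccc1' are assumptions made for illustration; the SMIRKS is the one Andy posted.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# SMIRKS from Andy's post; everything else here is a hypothetical helper.
rxn = AllChem.ReactionFromSmarts('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
benzyl_chloride = Chem.MolFromSmiles('ClCc1ccccc1')


def alkylate_once(mols):
    """React every molecule once and keep only products that sanitize."""
    products = {}
    for mol in mols:
        for product_set in rxn.RunReactants((mol, benzyl_chloride)):
            prod = product_set[0]  # first product template: the alkylated amine
            try:
                Chem.SanitizeMol(prod)
            except Exception:
                continue  # e.g. a neutral nitrogen with four bonds
            # Deduplicate by canonical SMILES
            products[Chem.MolToSmiles(prod)] = prod
    return list(products.values())


def exhaust(start_smiles, max_rounds=10):
    """Repeat the alkylation until no valid product is formed."""
    current = [Chem.MolFromSmiles(start_smiles)]
    for _ in range(max_rounds):
        nxt = alkylate_once(current)
        if not nxt:
            break  # every further alkylation failed sanitization
        current = nxt
    return sorted(Chem.MolToSmiles(m) for m in current)


print(exhaust('CCN'))
print(exhaust('NN'))
```

With this SMIRKS the loop should stop at the tertiary amine for 'CCN', because a neutral nitrogen with four bonds fails sanitization. Quaternary products would need a SMIRKS that also sets the charge.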

PS: I forgot that the results of a reaction enumeration were not
sanitised, until I saw the error in the command line.

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer



On 18 January 2018 at 00:07, Andy Jennings 
wrote:

> Hi Christos,
>
> Many thanks for the reply. I hadn't appreciated that the presence of a
> single invalid reagent would bring the entire thing crashing down, rather
> than issuing a warning/error and moving onto other molecules in the set.
> Good to know, and I'll have to be less lazy in my code ;-)
>
> Best,
> Andy
>
> On Wed, Jan 17, 2018 at 1:56 PM, Christos Kannas 
> wrote:
>
>> Hi Andy,
>>
>> The reason that your code breaks is that the second product of the third
>> iteration ('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') is not a valid
>> molecule.
>> And when calling Chem.MolFromSmiles('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1')
>> it creates a None object.
>> So you have to filter out the molecules that are not valid.
>>
>> See this Jupyter Notebook at cell 5, the 1st line in the while loop.
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>>
>> On 17 January 2018 at 18:16, Andy Jennings 
>> wrote:
>>
>>> Hi RDKitters,
>>>
>>> I have a question and an observation on the topic of library enumeration.
>>>
>>> First, the question: is there a call within RDKit to trigger the
>>> exhaustive reaction of reagents? For example, if I have two reagents - a
>>> primary amine and an alkyl chloride - can I tell RDKit to enumerate the
>>> reaction as though there were an excess of each reagent? In my case here
>>> the reaction would continue until the alkylation can no longer occur
>>> because there are no more valences available on the amine and I would
>>> either be tri-alkylated for a neutral product or quat-alkylated for a
>>> positively charged product
>>> e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R
>>>
>>> This brings me to my observation. When I try to attempt exactly this by
>>> repeatedly exposing the product to the reagent again I am able to drive it
>>> to exhaustion *in some cases*.
>>>
>>> For example, in the example above where RCl is benzyl chloride and my
>>> smirks is:
>>> [#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]
>>> I do drive the final product to be exclusively the tri-alkylated amine.
>>> Success.
>>>
>>> However, when I attempt the same thing with an amine with more than one
>>> reactive nitrogen (e.g. NN) I don't get a single product with 6
>>> alkylations, I get two unique products, each with three alkylations. One
>>> product has two alkylations on the first nitrogen and one on the second,
>>> the other product has three alkylations on the first nitrogen and none on
>>> the second. Attempting to drive the reaction once again leads to a
>>> 'reaction called with None reactants' ValueError. My dreadful code is below
>>> and the output is
>>> Reaction 1: ['NNCc1ccccc1']
>>> Reaction 2: ['NN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNNCc2ccccc2)cc1']
>>> Reaction 3: ['c1ccc(CNN(Cc2ccccc2)Cc2ccccc2)cc1',
>>> 'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
>>> Reaction 4: ValueError
>>>
>>> Any pointers would be great, as would any pre-existing library
>>> enumeration code. The examples I've found shipped with RDKit don't appear
>>> to allow me to name the products using a combination of the reagent names
>>> (useful for tracking library content).
>>>
>>> Best,
>>> Andy
>>>
>>>  Code snippet 
>>>
>>> amine = Chem.MolFromSmiles('NN')
>>> acyl = Chem.MolFromSmiles('c1ccccc1CCl')
>>> rxn = AllChem.ReactionFromSmarts('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
>>>
>>> # First reaction
>>> reactantListMols = [amine,acyl]
>>> prods = AllChem.EnumerateLibraryFromReaction(rxn,[reactantListMols,reactantListMols])
>>> prods = list(prods)
>>> smis = list(set([Chem.MolToSmiles(x[0],isomericSmiles=True) for x in prods]))
>>> 

Re: [Rdkit-discuss] Exhaustive Library Enumeration

2018-01-17 Thread Andy Jennings
Hi Christos,

Many thanks for the reply. I hadn't appreciated that the presence of a
single invalid reagent would bring the entire thing crashing down, rather
than issuing a warning/error and moving onto other molecules in the set.
Good to know, and I'll have to be less lazy in my code ;-)

Best,
Andy

On Wed, Jan 17, 2018 at 1:56 PM, Christos Kannas 
wrote:

> Hi Andy,
>
> The reason that your code breaks is that the second product of the third
> iteration ('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') is not a valid
> molecule.
> And when calling Chem.MolFromSmiles('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1')
> it creates a None object.
> So you have to filter out the molecules that are not valid.
>
> See this Jupyter Notebook at cell 5, the 1st line in the while loop.
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Chem[o]informatics Researcher & Software Developer
>
>
> On 17 January 2018 at 18:16, Andy Jennings 
> wrote:
>
>> Hi RDKitters,
>>
>> I have a question and an observation on the topic of library enumeration.
>>
>> First, the question: is there a call within RDKit to trigger the
>> exhaustive reaction of reagents? For example, if I have two reagents - a
>> primary amine and an alkyl chloride - can I tell RDKit to enumerate the
>> reaction as though there were an excess of each reagent? In my case here
>> the reaction would continue until the alkylation can no longer occur
>> because there are no more valences available on the amine and I would
>> either be tri-alkylated for a neutral product or quat-alkylated for a
>> positively charged product
>> e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R
>>
>> This brings me to my observation. When I try to attempt exactly this by
>> repeatedly exposing the product to the reagent again I am able to drive it
>> to exhaustion *in some cases*.
>>
>> For example, in the example above where RCl is benzyl chloride and my
>> smirks is:
>> [#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]
>> I do drive the final product to be exclusively the tri-alkylated amine.
>> Success.
>>
>> However, when I attempt the same thing with an amine with more than one
>> reactive nitrogen (e.g. NN) I don't get a single product with 6
>> alkylations, I get two unique products, each with three alkylations. One
>> product has two alkylations on the first nitrogen and one on the second,
>> the other product has three alkylations on the first nitrogen and none on
>> the second. Attempting to drive the reaction once again leads to a
>> 'reaction called with None reactants' ValueError. My dreadful code is below
>> and the output is
>> Reaction 1: ['NNCc1ccccc1']
>> Reaction 2: ['NN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNNCc2ccccc2)cc1']
>> Reaction 3: ['c1ccc(CNN(Cc2ccccc2)Cc2ccccc2)cc1',
>> 'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
>> Reaction 4: ValueError
>>
>> Any pointers would be great, as would any pre-existing library
>> enumeration code. The examples I've found shipped with RDKit don't appear
>> to allow me to name the products using a combination of the reagent names
>> (useful for tracking library content).
>>
>> Best,
>> Andy
>>
>>  Code snippet 
>>
>> amine = Chem.MolFromSmiles('NN')
>> acyl = Chem.MolFromSmiles('c1ccccc1CCl')
>> rxn = AllChem.ReactionFromSmarts('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
>>
>> # First reaction
>> reactantListMols = [amine,acyl]
>> prods = AllChem.EnumerateLibraryFromReaction(rxn,[reactantListMols,reactantListMols])
>> prods = list(prods)
>> smis = list(set([Chem.MolToSmiles(x[0],isomericSmiles=True) for x in prods]))
>> print(smis)
>> # ['NNCc1ccccc1']
>>
>> # Now repeat until doom
>> for i in range(0,10):
>>     oldproducts = [Chem.MolFromSmiles(x) for x in smis]
>>     reactantListMols = oldproducts + [acyl]
>>     prods = AllChem.EnumerateLibraryFromReaction(rxn,[reactantListMols,reactantListMols])
>>     prods = list(prods)
>>     smis = list(set([Chem.MolToSmiles(x[0],isomericSmiles=True) for x in prods]))
>>     print(smis)
>>
>>  End Code 
>>
>>
>>
>> 
>>
>>
>

Re: [Rdkit-discuss] Exhaustive Library Enumeration

2018-01-17 Thread Christos Kannas
Hi Andy,

The reason that your code breaks is that the second product of the third
iteration ('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') is not a valid
molecule.
And when calling Chem.MolFromSmiles('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1')
it creates a None object.
So you have to filter out the molecules that are not valid.
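
A minimal sketch of that filter, assuming the benzyl groups in the SMILES above are written out in full as Cc1ccccc1 (the variable names are only illustrative):

```python
from rdkit import Chem

# Two of the product SMILES from the thread, with the phenyl rings written out.
candidates = [
    'NNCc1ccccc1',                        # valid mono-alkylated hydrazine
    'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1',  # neutral N with four bonds: invalid
]

# MolFromSmiles returns None (and logs an error) when sanitization fails.
mols = [Chem.MolFromSmiles(smi) for smi in candidates]

# Keep only the molecules that parsed, so no None reaches the reaction.
valid = [m for m in mols if m is not None]

print([Chem.MolToSmiles(m) for m in valid])
```

Only the first entry should survive, and the list passed back into the enumeration then contains no None reactants.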

See this Jupyter Notebook at cell 5, the 1st line in the while loop.

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer



On 17 January 2018 at 18:16, Andy Jennings 
wrote:

> Hi RDKitters,
>
> I have a question and an observation on the topic of library enumeration.
>
> First, the question: is there a call within RDKit to trigger the
> exhaustive reaction of reagents? For example, if I have two reagents - a
> primary amine and an alkyl chloride - can I tell RDKit to enumerate the
> reaction as though there were an excess of each reagent? In my case here
> the reaction would continue until the alkylation can no longer occur
> because there are no more valences available on the amine and I would
> either be tri-alkylated for a neutral product or quat-alkylated for a
> positively charged product
> e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R
>
> This brings me to my observation. When I try to attempt exactly this by
> repeatedly exposing the product to the reagent again I am able to drive it
> to exhaustion *in some cases*.
>
> For example, in the example above where RCl is benzyl chloride and my
> smirks is:
> [#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]
> I do drive the final product to be exclusively the tri-alkylated amine.
> Success.
>
> However, when I attempt the same thing with an amine with more than one
> reactive nitrogen (e.g. NN) I don't get a single product with 6
> alkylations, I get two unique products, each with three alkylations. One
> product has two alkylations on the first nitrogen and one on the second,
> the other product has three alkylations on the first nitrogen and none on
> the second. Attempting to drive the reaction once again leads to a
> 'reaction called with None reactants' ValueError. My dreadful code is below
> and the output is
> Reaction 1: ['NNCc1ccccc1']
> Reaction 2: ['NN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNNCc2ccccc2)cc1']
> Reaction 3: ['c1ccc(CNN(Cc2ccccc2)Cc2ccccc2)cc1',
> 'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
> Reaction 4: ValueError
>
> Any pointers would be great, as would any pre-existing library enumeration
> code. The examples I've found shipped with RDKit don't appear to allow me
> to name the products using a combination of the reagent names (useful for
> tracking library content).
>
> Best,
> Andy
>
>  Code snippet 
>
> amine = Chem.MolFromSmiles('NN')
> acyl = Chem.MolFromSmiles('c1ccccc1CCl')
> rxn = AllChem.ReactionFromSmarts('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
>
> # First reaction
> reactantListMols = [amine,acyl]
> prods = AllChem.EnumerateLibraryFromReaction(rxn,[reactantListMols,reactantListMols])
> prods = list(prods)
> smis = list(set([Chem.MolToSmiles(x[0],isomericSmiles=True) for x in prods]))
> print(smis)
> # ['NNCc1ccccc1']
>
> # Now repeat until doom
> for i in range(0,10):
>     oldproducts = [Chem.MolFromSmiles(x) for x in smis]
>     reactantListMols = oldproducts + [acyl]
>     prods = AllChem.EnumerateLibraryFromReaction(rxn,[reactantListMols,reactantListMols])
>     prods = list(prods)
>     smis = list(set([Chem.MolToSmiles(x[0],isomericSmiles=True) for x in prods]))
>     print(smis)
>
>  End Code 
>
>
>
> 
>
>


Re: [Rdkit-discuss] edge matrix

2018-01-17 Thread Chris Earnshaw
I don't think there's a way to do this using RDKit itself, but it appears
to be straightforward using Python with numpy and networkx, e.g.

import numpy as np
import networkx as nx
a = np.matrix([[0, 1, 0, 0, 0],[1, 0, 1, 1, 0],[0, 1, 0, 0, 0],[0, 1, 0, 0,
1],[0, 0, 0, 1, 0]])
b = nx.from_numpy_matrix(a)
lg = nx.line_graph(b)
ea = nx.adjacency_matrix(lg)
ea

matrix([[ 0.,  1.,  1.,  0.],
        [ 1.,  0.,  1.,  0.],
        [ 1.,  1.,  0.,  1.],
        [ 0.,  0.,  1.,  0.]])
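
If networkx is not available, the same matrix can probably be obtained with numpy alone via the incidence matrix B (atoms x bonds): B^T B counts the atoms two bonds share, so subtracting 2 on the diagonal leaves the edge-adjacency matrix. This is only a sketch; the function name is hypothetical, and the bonds are ordered by their (i, j) position in the upper triangle of the input, which is an assumption about the ordering you want.

```python
import numpy as np


def edge_adjacency_from_adjacency(adj):
    """Edge-adjacency (line graph) matrix from a vertex-adjacency matrix."""
    adj = np.asarray(adj)
    # List each undirected bond once, in row-major order of the upper triangle.
    edges = np.argwhere(np.triu(adj) > 0)
    n_atoms, n_edges = adj.shape[0], len(edges)
    # Incidence matrix: entry (atom, bond) is 1 if the atom is an end of the bond.
    incidence = np.zeros((n_atoms, n_edges), dtype=int)
    for k, (i, j) in enumerate(edges):
        incidence[i, k] = 1
        incidence[j, k] = 1
    # (B^T B)[e, f] = number of atoms shared by bonds e and f; the diagonal is 2.
    return incidence.T @ incidence - 2 * np.eye(n_edges, dtype=int)


a = [[0, 1, 0, 0, 0],
     [1, 0, 1, 1, 0],
     [0, 1, 0, 0, 0],
     [0, 1, 0, 0, 1],
     [0, 0, 0, 1, 0]]
print(edge_adjacency_from_adjacency(a))
```

For the adjacency matrix of 'CC(C)CC' this gives the same 4 x 4 matrix as the networkx route above.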

Hope this helps - but I'm way out of my depth here!

Best regards,
Chris


On 17 January 2018 at 16:57, Mario Lovrić  wrote:

> Correct, I am looking for a rdkit-hidden-option to do it :D
>
> On Wed, Jan 17, 2018 at 5:56 PM, Jason Biggs 
> wrote:
>
>> I am a novice when it comes to graph theory, but it seems like what is
>> wanted here is the adjacency matrix of the corresponding line graph (
>> http://mathworld.wolfram.com/LineGraph.html).
>>
>> I don't know how to do this in python, but if I use mathematica, it goes
>> like this
>>
>> adjacencyMatrix = {{0, 1, 0, 0, 0}, {1, 0, 1, 1, 0}, {0, 1, 0, 0,
>> 0}, {0, 1, 0, 0, 1}, {0, 0, 0, 1, 0}};
>>
>> graph = AdjacencyGraph[adjacencyMatrix];
>> lineGraph = LineGraph[graph];
>> AdjacencyMatrix[lineGraph] // MatrixForm
>>
>> [image: Inline image 1]
>>
>>
>> Jason Biggs
>>
>>
>> On Wed, Jan 17, 2018 at 10:21 AM, Marta Stępniewska-Dziubińska via
>> Rdkit-discuss  wrote:
>>
>>> Hi Mario,
>>>
>>> What exactly do you mean by 'edge matrix'? Are you sure you provided a
>>> correct example? If you want to get an adjacency matrix of a molecular
>>> graph you can iterate over bonds to get it:
>>>
>>> from rdkit.Chem import MolFromSmiles
>>> import numpy as np
>>> m = MolFromSmiles('CC(C)CC')
>>> n = m.GetNumAtoms()
>>> E = np.zeros((n, n))
>>> for b in m.GetBonds():
>>>     i = b.GetBeginAtomIdx()
>>>     j = b.GetEndAtomIdx()
>>>     E[[i,j], [j,i]] = 1
>>>
>>>
>>> Hope this helps,
>>> Marta SD
>>>
>>>
>>>
>>> 2018-01-17 16:31 GMT+01:00 Mario Lovrić :
>>>
 Dear all,

 Does anyone have an idea how to get an edge matrix (graph theory) out
 of RDKit? I dug deep but didn't find anything.

 For example, for:

 'CC(C)CC'


 it would be:

 array([[0, 1, 1, 0],
        [1, 0, 1, 0],
        [1, 1, 0, 1],
        [0, 0, 1, 0]])

 Thanks.


 --
 Mario Lovrić

 


>>>
>>> 
>>>
>>>
>>
>
>
> --
> Mario Lovrić
>
> 
>
>


Re: [Rdkit-discuss] edge matrix

2018-01-17 Thread Mario Lovrić
Correct, I am looking for a rdkit-hidden-option to do it :D

On Wed, Jan 17, 2018 at 5:56 PM, Jason Biggs  wrote:

> I am a novice when it comes to graph theory, but it seems like what is
> wanted here is the adjacency matrix of the corresponding line graph (
> http://mathworld.wolfram.com/LineGraph.html).
>
> I don't know how to do this in python, but if I use mathematica, it goes
> like this
>
> adjacencyMatrix = {{0, 1, 0, 0, 0}, {1, 0, 1, 1, 0}, {0, 1, 0, 0,
> 0}, {0, 1, 0, 0, 1}, {0, 0, 0, 1, 0}};
>
> graph = AdjacencyGraph[adjacencyMatrix];
> lineGraph = LineGraph[graph];
> AdjacencyMatrix[lineGraph] // MatrixForm
>
> [image: Inline image 1]
>
>
> Jason Biggs
>
>
> On Wed, Jan 17, 2018 at 10:21 AM, Marta Stępniewska-Dziubińska via
> Rdkit-discuss  wrote:
>
>> Hi Mario,
>>
>> What exactly do you mean by 'edge matrix'? Are you sure you provided a
>> correct example? If you want to get an adjacency matrix of a molecular
>> graph you can iterate over bonds to get it:
>>
>> from rdkit.Chem import MolFromSmiles
>> import numpy as np
>> m = MolFromSmiles('CC(C)CC')
>> n = m.GetNumAtoms()
>> E = np.zeros((n, n))
>> for b in m.GetBonds():
>>     i = b.GetBeginAtomIdx()
>>     j = b.GetEndAtomIdx()
>>     E[[i,j], [j,i]] = 1
>>
>>
>> Hope this helps,
>> Marta SD
>>
>>
>>
>> 2018-01-17 16:31 GMT+01:00 Mario Lovrić :
>>
>>> Dear all,
>>>
>>> Does anyone have an idea how to get an edge matrix (graph theory) out
>>> of RDKit? I dug deep but didn't find anything.
>>>
>>> For example, for:
>>>
>>> 'CC(C)CC'
>>>
>>>
>>> it would be:
>>>
>>> array([[0, 1, 1, 0],
>>>        [1, 0, 1, 0],
>>>        [1, 1, 0, 1],
>>>        [0, 0, 1, 0]])
>>>
>>> Thanks.
>>>
>>>
>>> --
>>> Mario Lovrić
>>>
>>> 
>>>
>>>
>>
>> 
>>
>>
>


-- 
Mario Lovrić


Re: [Rdkit-discuss] edge matrix

2018-01-17 Thread Jason Biggs
I am a novice when it comes to graph theory, but it seems like what is
wanted here is the adjacency matrix of the corresponding line graph (
http://mathworld.wolfram.com/LineGraph.html).

I don't know how to do this in python, but if I use mathematica, it goes
like this

adjacencyMatrix = {{0, 1, 0, 0, 0}, {1, 0, 1, 1, 0}, {0, 1, 0, 0,
0}, {0, 1, 0, 0, 1}, {0, 0, 0, 1, 0}};

graph = AdjacencyGraph[adjacencyMatrix];
lineGraph = LineGraph[graph];
AdjacencyMatrix[lineGraph] // MatrixForm

[image: Inline image 1]


Jason Biggs


On Wed, Jan 17, 2018 at 10:21 AM, Marta Stępniewska-Dziubińska via
Rdkit-discuss  wrote:

> Hi Mario,
>
> What exactly do you mean by 'edge matrix'? Are you sure you provided a
> correct example? If you want to get an adjacency matrix of a molecular
> graph you can iterate over bonds to get it:
>
> from rdkit.Chem import MolFromSmiles
> import numpy as np
> m = MolFromSmiles('CC(C)CC')
> n = m.GetNumAtoms()
> E = np.zeros((n, n))
> for b in m.GetBonds():
>     i = b.GetBeginAtomIdx()
>     j = b.GetEndAtomIdx()
>     E[[i,j], [j,i]] = 1
>
>
> Hope this helps,
> Marta SD
>
>
>
> 2018-01-17 16:31 GMT+01:00 Mario Lovrić :
>
>> Dear all,
>>
>> Does anyone have an idea how to get an edge matrix (graph theory) out of
>> RDKit? I dug deep but didn't find anything.
>>
>> For example, for:
>>
>> 'CC(C)CC'
>>
>>
>> it would be:
>>
>> array([[0, 1, 1, 0],
>>        [1, 0, 1, 0],
>>        [1, 1, 0, 1],
>>        [0, 0, 1, 0]])
>>
>> Thanks.
>>
>>
>> --
>> Mario Lovrić
>>
>> 
>>
>>
>
> 
>
>


Re: [Rdkit-discuss] edge matrix

2018-01-17 Thread Mario Lovrić
Dear Marta and Guillaume,

Thank you both.
Your solutions give the same output, which is the vertex-adjacency
matrix.
There is something called the edge-adjacency matrix. It's defined in several
papers by Trinajstić to calculate the M2 Zagreb indices, e.g. "The Zagreb
Indices 30 Years After".
The matrix I posted was written by hand; it describes adjacent bonds instead
of atoms.
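
To make the definition concrete, here is a minimal sketch in plain Python: two bonds count as adjacent when they share an atom. The function name is hypothetical, and the bond list is typed in by hand for 'CC(C)CC'; with RDKit it could presumably be built from mol.GetBonds().

```python
def edge_adjacency(bonds):
    """Return the edge-adjacency matrix for a list of (atom_i, atom_j) bonds."""
    n = len(bonds)
    matrix = [[0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            # Two bonds are adjacent when they have an atom in common.
            if set(bonds[a]) & set(bonds[b]):
                matrix[a][b] = matrix[b][a] = 1
    return matrix


# Bonds of 'CC(C)CC' listed by atom index. With RDKit the same list could come
# from [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()].
bonds = [(0, 1), (1, 2), (1, 3), (3, 4)]
for row in edge_adjacency(bonds):
    print(row)
# [0, 1, 1, 0]
# [1, 0, 1, 0]
# [1, 1, 0, 1]
# [0, 0, 1, 0]
```
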


KR


On Wed, Jan 17, 2018 at 5:21 PM, Marta Stępniewska-Dziubińska <
mart...@ibb.waw.pl> wrote:

> Hi Mario,
>
> What exactly do you mean by 'edge matrix'? Are you sure you provided a
> correct example? If you want to get an adjacency matrix of a molecular
> graph you can iterate over bonds to get it:
>
> from rdkit.Chem import MolFromSmiles
> import numpy as np
> m = MolFromSmiles('CC(C)CC')
> n = m.GetNumAtoms()
> E = np.zeros((n, n))
> for b in m.GetBonds():
>     i = b.GetBeginAtomIdx()
>     j = b.GetEndAtomIdx()
>     E[[i,j], [j,i]] = 1
>
>
> Hope this helps,
> Marta SD
>
>
>
> 2018-01-17 16:31 GMT+01:00 Mario Lovrić :
>
>> Dear all,
>>
>> Does anyone have an idea how to get an edge matrix (graph theory) out of
>> RDKit? I dug deep but didn't find anything.
>>
>> For example, for:
>>
>> 'CC(C)CC'
>>
>>
>> it would be:
>>
>> array([[0, 1, 1, 0],
>>        [1, 0, 1, 0],
>>        [1, 1, 0, 1],
>>        [0, 0, 1, 0]])
>>
>> Thanks.
>>
>>
>> --
>> Mario Lovrić
>>
>> 
>>
>>
>


-- 
Mario Lovrić


Re: [Rdkit-discuss] mol file parsing, 3D or 2D

2018-01-17 Thread Dimitri Maziuk

On 2018-01-17 10:25, Jason Biggs wrote:

> For the case in question, I find that if I read in a mol file containing
> 2D coordinates, and I skip the sanitization step altogether, then the 3D
> embedding algorithms fail.


Well, yes, as I mentioned in the other thread: the only way you can get 
it to work reliably is if you start with 3D coordinates to begin with. 
Otherwise your users have to get in there every once in a while and 
decide which way to slice that cake they don't get to eat. ;)


Dima



Re: [Rdkit-discuss] edge matrix

2018-01-17 Thread Marta Stępniewska-Dziubińska via Rdkit-discuss
Hi Mario,

What exactly do you mean by 'edge matrix'? Are you sure you provided a
correct example? If you want to get an adjacency matrix of a molecular
graph you can iterate over bonds to get it:

from rdkit.Chem import MolFromSmiles
import numpy as np
m = MolFromSmiles('CC(C)CC')
n = m.GetNumAtoms()
E = np.zeros((n, n))
for b in m.GetBonds():
    i = b.GetBeginAtomIdx()
    j = b.GetEndAtomIdx()
    E[[i,j], [j,i]] = 1


Hope this helps,
Marta SD



2018-01-17 16:31 GMT+01:00 Mario Lovrić :

> Dear all,
>
> Does anyone have an idea how to get an edge matrix (graph theory) out of
> RDKit? I dug deep but didn't find anything.
>
> For example, for:
>
> 'CC(C)CC'
>
>
> it would be:
>
> array([[0, 1, 1, 0],
>        [1, 0, 1, 0],
>        [1, 1, 0, 1],
>        [0, 0, 1, 0]])
>
> Thanks.
>
>
> --
> Mario Lovrić
>
> 
>
>


Re: [Rdkit-discuss] edge matrix

2018-01-17 Thread Guillaume GODIN
Dear Mario,

There is an adjacency matrix available:

from rdkit import Chem
mol = Chem.MolFromSmiles('CC(C)CC')
adj = Chem.GetAdjacencyMatrix(mol)
print(adj)

[[0 1 0 0 0]
 [1 0 1 1 0]
 [0 1 0 0 0]
 [0 1 0 0 1]
 [0 0 0 1 0]]

But this is not what you want…

Can you explain how you generated your output, please?

BR,

Guillaume
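
One possible reading of the 4x4 example in the original question is the edge-adjacency matrix, i.e. the adjacency matrix of the line graph: rows and columns are bonds, and two bonds are connected when they share an atom. This is an assumption about what was meant, not something confirmed in the thread. A minimal sketch under that assumption, with the bond list for 'CC(C)CC' written out by hand (the helper name edge_adjacency is made up for illustration):

```python
import numpy as np


def edge_adjacency(bonds):
    """Edge-adjacency matrix: entry (i, j) is 1 when bonds i and j share an atom."""
    n = len(bonds)
    E = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if set(bonds[i]) & set(bonds[j]):
                E[i, j] = E[j, i] = 1
    return E


# Bonds of 'CC(C)CC' as (begin atom index, end atom index) pairs.
# With RDKit, the same pairs could be collected from mol.GetBonds()
# using GetBeginAtomIdx() and GetEndAtomIdx().
bonds = [(0, 1), (1, 2), (1, 3), (3, 4)]
E = edge_adjacency(bonds)
print(E)
```

For this molecule the result has the same entries as the array in the question, which is what suggests the line-graph interpretation.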


From: Mario Lovrić
Date: Wednesday, 17 January 2018 at 16:31
To: RDKit Discuss
Subject: [Rdkit-discuss] edge matrix

Dear all,

Does anyone have an idea how to get an edge matrix (graph theory) out of 
RDKit? I dug deep but didn't find anything.

For example, for:

'CC(C)CC'


it would be:

array([[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]])

Thanks.


--
Mario Lovrić



[Rdkit-discuss] ConstrainedEmbed and issues with overlapping Hs

2018-01-17 Thread Susan Leung
Dear all,

I am using ConstrainedEmbed to generate conformers. I call AddHs before 
ConstrainedEmbed, but I am finding that some of the conformers have Hs that 
overlap (have the same coordinates).

Here is one example below.

from rdkit import Chem
from rdkit.Chem import AllChem, rdFMCS
from rdkit.Chem.Draw import IPythonConsole

m = Chem.MolFromMolBlock("""
     RDKit          3D

 23 25  0  0  0  0  0  0  0  0999 V2000
   68.3300   51.1910   11.2220 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.2290   50.0380    7.4130 N   0  0  0  0  0  0  0  0  0  0  0  0
   69.1000   46.9130    6.3200 O   0  0  0  0  0  0  0  0  0  0  0  0
   68.1740   52.1510   10.2210 C   0  0  0  0  0  0  0  0  0  0  0  0
   66.6510   48.5160    5.8250 N   0  0  0  0  0  0  0  0  0  0  0  0
   67.8070   51.7450    8.9400 C   0  0  0  0  0  0  0  0  0  0  0  0
   70.2560   46.5410    8.2130 N   0  0  0  0  0  0  0  0  0  0  0  0
   68.1130   49.8320   10.9440 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.7430   49.4420    9.6500 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.5850   50.4050    8.6490 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.0030   48.7500    7.0910 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.1430   47.7100    8.0530 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.5210   48.1000    9.3470 C   0  0  0  0  0  0  0  0  0  0  0  0
   66.9080   46.1720    7.8280 C   0  0  0  0  0  0  0  0  0  0  0  0
   68.1970   45.3010    7.7860 C   0  0  0  0  0  0  0  0  0  0  0  0
   69.2630   46.2750    7.3670 C   0  0  0  0  0  0  0  0  0  0  0  0
   70.3640   45.7820    9.4850 C   0  0  0  0  0  0  0  0  0  0  0  0
   71.0660   47.6920    7.7870 C   0  0  1  0  0  0  0  0  0  0  0  0
   71.1300   48.7910    8.8520 C   0  0  0  0  0  0  0  0  0  0  0  0
   71.3550   50.1140    8.1030 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.4840   50.0120    7.0390 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.6270   48.6580    6.2850 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.4290   47.4350    7.1920 C   0  0  0  0  0  0  0  0  0  0  0  0
  4  1  2  0
  6  4  1  0
  8  1  1  0
  9  8  2  0
 10  9  1  0
 10  6  2  0
 10  2  1  0
 11  5  1  0
 11  2  2  0
 12 11  1  0
 13 12  2  0
 13  9  1  0
 14 12  1  0
 15 14  1  0
 16 15  1  0
 16  7  1  0
 16  3  2  0
 17  7  1  0
 18  7  1  1
 19 18  1  0
 20 19  1  0
 21 20  1  0
 22 21  1  0
 23 22  1  0
 23 18  1  0
M  END
""")

sm = "CN(C(=O)CCc1cc2ccccc2nc1N)C1CCCCC1CC"
# maximum common substructure between the reference and the target molecule
res = rdFMCS.FindMCS([m, Chem.MolFromSmiles(sm)], completeRingsOnly=True,
                     ringMatchesRingOnly=True, matchValences=True)
# keep only the common core of the reference, with its 3D coordinates
core1 = AllChem.DeleteSubstructs(
    AllChem.ReplaceSidechains(m, Chem.MolFromSmarts(res.smartsString)),
    Chem.MolFromSmiles('*'))

confs = []
sm_H_mol = Chem.AddHs(Chem.MolFromSmiles(sm))
for i in range(10):
    sm_H_conf = AllChem.ConstrainedEmbed(sm_H_mol, core1, randomseed=i)
    confs.append(sm_H_conf)

print(Chem.MolToMolBlock(confs[1]))

>


     RDKit          3D

 54 56  0  0  0  0  0  0  0  0999 V2000
   70.3134   45.7178    9.5447 C   0  0  0  0  0  0  0  0  0  0  0  0
   70.3467   46.4946    8.2897 N   0  0  0  0  0  0  0  0  0  0  0  0
   69.2899   46.2820    7.3349 C   0  0  0  0  0  0  0  0  0  0  0  0
   69.2440   46.9503    6.2711 O   0  0  0  0  0  0  0  0  0  0  0  0
   68.1131   45.3908    7.6445 C   0  0  0  0  0  0  0  0  0  0  0  0
   66.8303   46.2273    7.8443 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.0755   47.7186    8.0943 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.4608   48.1061    9.3845 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.7033   49.4491    9.6830 C   0  0  0  0  0  0  0  0  0  0  0  0
   68.0967   49.8458   10.9679 C   0  0  0  0  0  0  0  0  0  0  0  0
   68.3350   51.1969   11.2372 C   0  0  0  0  0  0  0  0  0  0  0  0
   68.1872   52.1531   10.2278 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.8127   51.7603    8.9409 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.5741   50.4115    8.6657 C   0  0  0  0  0  0  0  0  0  0  0  0
   67.2187   50.0280    7.4186 N   0  0  0  0  0  0  0  0  0  0  0  0
   66.9721   48.7322    7.0969 C   0  0  0  0  0  0  0  0  0  0  0  0
   66.6333   48.4598    5.7410 N   0  0  0  0  0  0  0  0  0  0  0  0
   71.4151   47.5160    8.1630 C   0  0  0  0  0  0  0  0  0  0  0  0
   71.1074   48.8659    8.8420 C   0  0  0  0  0  0  0  0  0  0  0  0
   71.2788   50.0894    7.9509 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.5203   49.9911    7.0632 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.4817   48.7307    6.1876 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.4676   47.4487    7.0184 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.4079   46.1960    6.1098 C   0  0  0  0  0  0  0  0  0  0  0  0
   72.8476   44.9214    6.8308 C   0  0  0  0  0  0  0  0  0  0  0  0
   69.3674   45.9082   10.0888 H   0  0  0  0  0  0  0  0  0  0  0  0
   71.1204   45.9748   10.2614 H   0  0  0  0  0  0  0  0  0  0  0  0
   70.4079   44.6341    9.3233 H   0  0  0  0  0  0  0  0  0  0  0  0
   68.2620   44.7176    8.5091 H   0  0  0  0  0  0  0  0  
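
A quick way to flag conformers like the one described above is to look for atom pairs whose coordinates coincide. The following is only a minimal sketch, assuming the coordinates are available as (x, y, z) tuples; the helper name overlapping_pairs, the tolerance, and the toy coordinates are made up for illustration and do not come from the post. With RDKit, the coordinates could be taken from mol.GetConformer().GetPositions().

```python
from itertools import combinations


def overlapping_pairs(coords, tol=1e-3):
    """Return index pairs (i, j) of atoms closer than `tol` to each other."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        distance = sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
        if distance < tol:
            pairs.append((i, j))
    return pairs


# Toy coordinates: atoms 1 and 2 sit on exactly the same point.
coords = [(0.0, 0.0, 0.0), (1.09, 0.0, 0.0), (1.09, 0.0, 0.0)]
print(overlapping_pairs(coords))
```

Running such a check on each embedded conformer would at least make it easy to count how many seeds give overlapping Hs.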

Re: [Rdkit-discuss] mol file parsing, 3D or 2D

2018-01-17 Thread Jason Biggs
On Wed, Jan 17, 2018 at 10:12 AM, Dimitri Maziuk 
wrote:

> On 2018-01-16 22:46, Greg Landrum wrote:
>
>> It might be worth thinking about adding an option to the aromaticity
>> perception code to maintain the original bond types and just set the
>> "isAromatic" flag on the bonds.
>
> This is how it's modeled in mmCIF chem. comp. It may or may not come from
> the OpenEye tools they were originally using to process their ligands/chem
> comps.
>
> From a programming perspective it's pretty annoying, since you have to
> remember to add an extra if stanza to all your code, queries, etc.
>
> What's wrong with keeping a copy of the original molecule around? I'm
> not sure I get the "I want to sanitize and keep the original bonds too"; it
> sounds too much like the proverbial cake.
>

To the extent possible, I do want to allow users to have and eat the cake
:-).

For the case in question, I find that if I read in a mol file containing 2D
coordinates, and I skip the sanitization step altogether, then the 3D
embedding algorithms fail.


>
> Dima
>
>


[Rdkit-discuss] Exhaustive Library Enumeration

2018-01-17 Thread Andy Jennings
Hi RDKitters,

I have a question and an observation on the topic of library enumeration.

First, the question: is there a call within RDKit to trigger the exhaustive
reaction of reagents? For example, if I have two reagents - a primary amine
and an alkyl chloride - can I tell RDKit to enumerate the reaction as though
there were an excess of each reagent? In my case here the reaction would
continue until the alkylation can no longer occur because there are no more
valences available on the amine and I would either be tri-alkylated for a
neutral product or quat-alkylated for a positively charged product
e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R

This brings me to my observation. When I try to attempt exactly this by
repeatedly exposing the product to the reagent again I am able to drive it
to exhaustion *in some cases*.

For example, in the example above where RCl is benzyl chloride and my
SMIRKS is:
[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]
I do drive the final product to be exclusively the tri-alkylated amine.
Success.

However, when I attempt the same thing with an amine with more than one
reactive nitrogen (e.g. NN) I don't get a single product with 6
alkylations, I get two unique products, each with three alkylations. One
product has two alkylations on the first nitrogen and one on the second,
the other product has three alkylations on the first nitrogen and none on
the second. Attempting to drive the reaction once again leads to a
'reaction called with None reactants' ValueError. My dreadful code is below
and the output is
Reaction 1: ['NNCc1ccccc1']
Reaction 2: ['NN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNNCc2ccccc2)cc1']
Reaction 3: ['c1ccc(CNN(Cc2ccccc2)Cc2ccccc2)cc1',
'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
Reaction 4: ValueError

Any pointers would be great, as would any pre-existing library enumeration
code. The examples I've found shipped with RDKit don't appear to allow me
to name the products using a combination of the reagent names (useful for
tracking library content).

Best,
Andy

---- Code snippet ----

from rdkit import Chem
from rdkit.Chem import AllChem

amine = Chem.MolFromSmiles('NN')
acyl = Chem.MolFromSmiles('c1ccccc1CCl')  # benzyl chloride
rxn = AllChem.ReactionFromSmarts('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')

# First reaction
reactantListMols = [amine, acyl]
prods = AllChem.EnumerateLibraryFromReaction(rxn, [reactantListMols, reactantListMols])
prods = list(prods)
smis = list(set([Chem.MolToSmiles(x[0], isomericSmiles=True) for x in prods]))
print(smis)
# ['NNCc1ccccc1']

# Now repeat until doom
for i in range(0, 10):
    oldproducts = [Chem.MolFromSmiles(x) for x in smis]
    reactantListMols = oldproducts + [acyl]
    prods = AllChem.EnumerateLibraryFromReaction(rxn, [reactantListMols, reactantListMols])
    prods = list(prods)
    smis = list(set([Chem.MolToSmiles(x[0], isomericSmiles=True) for x in prods]))
    print(smis)

---- End Code ----
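
A likely cause of the ValueError is that a product with a neutral nitrogen carrying four substituents cannot be parsed back from SMILES, so Chem.MolFromSmiles returns None and that None is passed on as a reactant. The sketch below is one possible workaround, not an existing RDKit enumeration call: it restricts the reaction to nitrogens that still carry a hydrogen (the !H0 term is an addition, not part of the original SMIRKS), sanitizes each product, and stops when nothing reacts any more. The function name exhaustive_alkylation and its parameters are made up for illustration, and it only covers the neutral (non-quaternary) case.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def exhaustive_alkylation(amine_smiles, halide_smiles, max_rounds=10):
    """Alkylate N-H nitrogens round by round until no nitrogen can react."""
    # Assumption: only nitrogens with at least one H are allowed to react.
    rxn = AllChem.ReactionFromSmarts(
        '[#7;!H0:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
    halide = Chem.MolFromSmiles(halide_smiles)
    current = {Chem.MolToSmiles(Chem.MolFromSmiles(amine_smiles))}
    for _ in range(max_rounds):
        next_round = set()
        for smi in current:
            mol = Chem.MolFromSmiles(smi)
            for products in rxn.RunReactants((mol, halide)):
                product = products[0]  # the alkylated amine; products[1] is Cl
                try:
                    Chem.SanitizeMol(product)
                except ValueError:
                    continue  # skip chemically invalid products
                next_round.add(Chem.MolToSmiles(product))
        if not next_round:
            break  # nothing reacted: the previous round was exhaustive
        current = next_round
    return sorted(current)


print(exhaustive_alkylation('CCN', 'ClCc1ccccc1'))
print(exhaustive_alkylation('NN', 'ClCc1ccccc1'))
```

Because each round replaces the previous set of products instead of mixing them, partially alkylated intermediates do not appear in the final list. Product naming from reagent names is not handled here.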


Re: [Rdkit-discuss] mol file parsing, 3D or 2D

2018-01-17 Thread Dimitri Maziuk

On 2018-01-16 22:46, Greg Landrum wrote:

> It might be worth thinking about adding an option to the aromaticity 
> perception code to maintain the original bond types and just set the 
> "isAromatic" flag on the bonds.

This is how it's modeled in mmCIF chem. comp. It may or may not come 
from the OpenEye tools they were originally using to process their 
ligands/chem comps.

From a programming perspective it's pretty annoying, since you have to 
remember to add an extra if stanza to all your code, queries, etc.

What's wrong with keeping a copy of the original molecule around? I'm 
not sure I get the "I want to sanitize and keep the original bonds too"; 
it sounds too much like the proverbial cake.


Dima
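
The "keep a copy" suggestion above could look roughly like the sketch below. It is only an illustration under a few assumptions: the mol block is generated from a SMILES so the example is self-contained (a real workflow would read the user's file), and the variable names are made up. The unsanitized molecule keeps the bond types as written, while a sanitized copy is what gets passed to the 3D embedding.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in input: a Kekule-form mol block for phenol, generated here only
# so that the example runs on its own.
block = Chem.MolToMolBlock(Chem.MolFromSmiles('Oc1ccccc1'))

original = Chem.MolFromMolBlock(block, sanitize=False)  # bonds as written
working = Chem.Mol(original)                            # independent copy
Chem.SanitizeMol(working)                               # aromaticity set on the copy only

# Atom and bond indices are identical in both copies, so the original
# bond type can be looked up for any bond of the sanitized molecule.
for bond in working.GetBonds():
    original_type = original.GetBondWithIdx(bond.GetIdx()).GetBondType()
    print(bond.GetIdx(), bond.GetBondType(), original_type)

# The sanitized copy can then go on to 3D embedding.
mol3d = Chem.AddHs(working)
status = AllChem.EmbedMolecule(mol3d, randomSeed=42)  # 0 means success
print(status)
```

The cost is carrying two molecule objects around, which is the trade-off being discussed in this thread.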



[Rdkit-discuss] edge matrix

2018-01-17 Thread Mario Lovrić
Dear all,

Does anyone have an idea how to get an edge matrix (graph theory) out of
RDKit? I dug deep but didn't find anything.

For example, for:

'CC(C)CC'


it would be:

array([[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]])

Thanks.


-- 
Mario Lovrić