Re: [Rdkit-discuss] Conformer generation

2020-07-23 Thread Chuang, Kangway
Hi David,

Thanks for the clarification. One more computationally-expensive and naive 
approach would be to generate multiple conformers and check them against a 
known reference structure.

Here's a quick take on the approach that will be a bit slower. It assumes that 
mol is a probe molecule that has no conformers, ref_mol is a mol that has the 
desired geometry, and that you already know what the core scaffold is:


from rdkit import Chem
from rdkit.Chem import AllChem


def add_conformer_match(probe_mol,       # assumed to have no conformers yet
                        reference_mol,   # assumed to have a single, correct conformer
                        substructure,    # SMILES of the conserved scaffold
                        threshold=0.5,   # RMSD threshold for accepting a match
                        num_confs=50,
                        num_threads=4):
    # create a consistent scaffold to match atom ids in both molecules
    scaffold = Chem.MolFromSmiles(substructure)
    probe_indices = probe_mol.GetSubstructMatch(scaffold)
    ref_indices = reference_mol.GetSubstructMatch(scaffold)
    atom_map = list(zip(probe_indices, ref_indices))

    # generate diverse conformers for the new molecule
    probe_mol_h = Chem.AddHs(probe_mol)
    conf_ids = list(AllChem.EmbedMultipleConfs(probe_mol_h,
                                               numConfs=num_confs,
                                               numThreads=num_threads,
                                               pruneRmsThresh=0.5))
    probe_mol_confs = Chem.RemoveHs(probe_mol_h)

    # align each conformer onto the reference and record the scaffold RMSD
    rms_list = []
    for conf_id in conf_ids:
        rmsd = AllChem.AlignMol(probe_mol_confs,
                                reference_mol,
                                prbCid=conf_id,
                                atomMap=atom_map)
        rms_list.append(rmsd)

    # check to see if you find a reasonable match
    if not rms_list or min(rms_list) > threshold:
        print(f'No conformer found within RMS threshold of {threshold}')
        return None

    # add the lowest-RMSD conformer to the original mol; look it up by its
    # conformer ID rather than by its position in rms_list
    best_conf_id = conf_ids[rms_list.index(min(rms_list))]
    probe_mol.AddConformer(probe_mol_confs.GetConformer(best_conf_id))
    return probe_mol  # return the original object with added conformer


# ref_mol is the molecule that already has the desired geometry
mol = Chem.MolFromSmiles('[17OH]C1=C2C=C(N)C=C1CC3=C([17O-])C(CC4=C([17OH])C(CC5=C(N)C(C2)=CC(S(=O)([O-])=O)=C5)=CC(S(=O)([O-])=O)=C4)=CC(N)=C3')
add_conformer_match(mol, ref_mol,
                    'C1(C2)=CC(CC3=CC(CC4=CC=CC(CC5=CC=CC2=C5)=C4)=CC=C3)=CC=C1')


Kangway


Re: [Rdkit-discuss] Conformer generation

2020-07-23 Thread David Turnbull
That is the structure I want, however I found that it doesn't give that 
structure every time (sometimes it inverts the rings).


From: Chuang, Kangway 
Sent: Thursday, July 23, 2020 8:55:01 AM
To: David Turnbull ; 
rdkit-discuss@lists.sourceforge.net 
Subject: Re: Conformer generation


Hi David,

Do you have a specific example of the bowl conformation you're looking for 
(e.g. do you have an image of the desired conformation vs what you are seeing)?

Running your current code I get the following conformer (shown in two different 
views).

Kangway


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Conformer generation

2020-07-23 Thread David Turnbull
Hi all,

I am trying to generate structures of calixarenes in a set shape. I am trying 
to use distance constraints but am struggling. The lowest-energy conformer is 
not what I want; I want the bowl shape. I labelled 4 oxygens as 17O (so that 
they are easy to find) and constrained the distances between them, as that 
should set the geometry, but it isn’t working. Any help would be great. Code 
below.

from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles('[17OH]C1=C2C=C([Y])C=C1CC3=C([17O-])C(CC4=C([17OH])C(CC5=C([17OH])C(C2)=CC(S(=O)([O-])=O)=C5)=CC(S(=O)([O-])=O)=C4)=CC([Y])=C3')
y = Chem.MolFromSmiles('OC1=CC=C([Y])C=C1')
rxn = AllChem.ReactionFromSmarts("[Y][*:1].[Y][*:2]>>[*:1][*:2]")

# first coupling at a [Y] attachment point
results = rxn.RunReactants([mol, y])
for products in results:
    for mol in products:
        m2 = mol

# second coupling at the remaining [Y] attachment point
results2 = rxn.RunReactants([m2, y])
for products in results2:
    for mol in products:
        m3 = mol

# round-trip through SMILES to get a clean molecule
m4 = Chem.MolToSmiles(m3)
x = Chem.MolFromSmiles(m4)

# collect the indices of the 17O-labelled oxygens
index = {}
k = 0
for atom in x.GetAtoms():
    if atom.GetSymbol() == 'O':
        if atom.GetIsotope() == 17:
            index[k] = atom.GetIdx()
            k += 1

m5 = Chem.AddHs(x)
AllChem.EmbedMolecule(m5, useRandomCoords=True)
ff = AllChem.UFFGetMoleculeForceField(m5)
ff.UFFAddDistanceConstraint(index[0], index[1], False, 3.5, 4.5, 99.0)
ff.UFFAddDistanceConstraint(index[0], index[2], False, 3.5, 4.5, 99.0)
ff.UFFAddDistanceConstraint(index[0], index[3], False, 3.5, 4.5, 99.0)
m6 = ff.Minimize(maxIts=50)

David


Re: [Rdkit-discuss] Conformer generation with torsion restraints/frozen atoms

2019-04-03 Thread Jan Halborg Jensen
Dear Angelica

Here are a couple of codes that may be of interest. They don’t do exactly what 
you want, but maybe they can give you some ideas.

https://github.com/jensengroup/get_conformations
https://github.com/jensengroup/TS_conf_search

Best regards, Jan



[Rdkit-discuss] Conformer generation with torsion restraints/frozen atoms

2019-04-02 Thread Angelica Parente
Hi,

I’d like to generate a set of conformers with restraints on some of the 
substructures. I’d like to keep one segment of the molecule frozen, allowing 
the rest of the molecule to be mobile. Within the part of the molecule that is 
mobile, I’d like to restrict the torsion angles for one of the substructures. 

 How can I go about doing this? I’d also like to make sure I’m getting 
exhaustive sampling, and I’m not sure how long this would take or how many 
conformers I’d need to generate considering this is a fairly large molecule. 

Thanks,

Angelica 



Re: [Rdkit-discuss] Conformer generation

2017-10-26 Thread Greg Landrum
I was about to reply "yeah, the first ID is the cluster centroid", but that
may not be true for all definitions of "centroid".

According to the Butina paper, the first point added is considered to be
the centroid. The definition of that is that all the other points in the
cluster are within the exclusion distance of the first point. It does not
necessarily mean that its the point in the cluster that would satisfy a
geometric definition of centroid.

But that's a detail. According to the definitions of the algorithm, the
first point is the centroid. :-)
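
As a small illustration of the point above, here is a sketch in plain Python that pulls the first ID out of each cluster tuple from the Butina example in this thread. The helper first_is_butina_centroid and the toy one-dimensional positions are made up for illustration (they are not RDKit functions); the helper only checks the property described above, namely that every other member lies within the exclusion distance of the first one.

```python
# Cluster tuples as printed in the Butina example from this thread.
cs = [
    (36, 2, 3, 4, 7, 8, 9, 11, 14, 15, 16, 17, 21, 29, 30, 33, 34, 40, 43, 44, 46, 47),
    (10, 0, 32, 35, 38, 45, 49, 18, 19, 22, 23, 25, 27),
    (48, 6, 39, 42, 12, 24, 26),
    (5, 41, 31),
    (37,), (28,), (20,), (13,), (1,),
]

# By the algorithm's definition, the first ID of each cluster is its centroid.
centroid_ids = [c[0] for c in cs]


def first_is_butina_centroid(cluster, dist, threshold):
    """Hypothetical helper: True if all members are within `threshold` of cluster[0]."""
    first = cluster[0]
    return all(dist(first, other) <= threshold for other in cluster[1:])


# Toy data: three points on a line, used only to exercise the helper.
positions = {0: 0.0, 1: 1.0, 2: 2.0}


def toy_dist(i, j):
    return abs(positions[i] - positions[j])


print(centroid_ids)
print(first_is_butina_centroid((1, 0, 2), toy_dist, 1.5))  # middle point first
print(first_is_butina_centroid((0, 1, 2), toy_dist, 1.5))  # end point first
```

In the toy data the middle point satisfies the definition while an end point does not, even though all three points are the same set.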

-greg





Re: [Rdkit-discuss] Conformer generation

2017-10-26 Thread Paul Hawkins
Thanks Greg. In the results from the Butina clustering is the cluster centre 
the first ID in the list? If so then to recover the cluster centres I’d just 
need to do something like:

for c in cs:
    conf_id_I_want = c[0]


Paul.



Re: [Rdkit-discuss] Conformer generation

2017-10-26 Thread Greg Landrum
On Wed, Oct 25, 2017 at 6:52 PM, Sereina  wrote:

> Hi Paul,
>
> Regarding your second question:
>
> On 25 Oct 2017, at 18:36, Paul Hawkins  wrote:
>
> Also, once I generate the conformers what is best way to cluster them by
> RMSD so that each conformer has a minimum RMSD to all the others in the set?
>
>
> I think the function AllChem.GetConformerRMSMatrix() might do (parts of)
> what you want.
>

And since we're doing the "each reply refines the answer" thing, here's a
bit of code that does Butina clustering to group the conformers:

In [71]: m = Chem.AddHs(Chem.MolFromSmiles('OCCc1n1'))

In [72]: AllChem.EmbedMultipleConfs(m,50,AllChem.ETKDG())
Out[72]: 

In [73]: dm = AllChem.GetConformerRMSMatrix(m)

In [76]: cs =
Butina.ClusterData(dm,m.GetNumConformers(),1.5,isDistData=True,reordering=True)

In [77]: len(cs)
Out[77]: 9

In [78]: for c in cs:
...: print(c)
...:
(36, 2, 3, 4, 7, 8, 9, 11, 14, 15, 16, 17, 21, 29, 30, 33, 34, 40, 43, 44,
46, 47)
(10, 0, 32, 35, 38, 45, 49, 18, 19, 22, 23, 25, 27)
(48, 6, 39, 42, 12, 24, 26)
(5, 41, 31)
(37,)
(28,)
(20,)
(13,)
(1,)


The 1.5 argument in the call to Butina.ClusterData is the distance
threshold for things to be considered neighbors.
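
To make the role of the threshold a bit more concrete, here is a rough sketch in plain Python. It assumes that the distance list is a flattened lower triangle in the order (1,0), (2,0), (2,1), (3,0), ... and that two items count as neighbors when their distance is less than or equal to the threshold; both are my understanding of the inputs rather than something stated above. The names tri_index and neighbor_counts and the toy distances are hypothetical.

```python
def tri_index(i, j):
    """Position of the (i, j) distance in a flattened lower-triangle list."""
    if i == j:
        raise ValueError("no entry for the diagonal")
    if i < j:
        i, j = j, i
    return i * (i - 1) // 2 + j


def neighbor_counts(dm, n, threshold):
    """Count, for each of n items, how many others lie within the threshold."""
    counts = [0] * n
    for i in range(n):
        for j in range(i):
            if dm[tri_index(i, j)] <= threshold:
                counts[i] += 1
                counts[j] += 1
    return counts


# Toy distances for 4 items, in the order (1,0), (2,0), (2,1), (3,0), (3,1), (3,2).
toy_dm = [0.4, 1.2, 1.0, 2.5, 2.2, 1.8]

print(neighbor_counts(toy_dm, 4, 1.5))
print(neighbor_counts(toy_dm, 4, 0.5))
```

A larger threshold gives each item more neighbors and therefore fewer, larger clusters; a smaller one does the opposite.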

-greg


Re: [Rdkit-discuss] Conformer generation

2017-10-25 Thread Sereina
Hi Paul,

Regarding your second question:

> On 25 Oct 2017, at 18:36, Paul Hawkins  wrote:
> 
> Also, once I generate the conformers what is best way to cluster them by RMSD 
> so that each conformer has a minimum RMSD to all the others in the set?

I think the function AllChem.GetConformerRMSMatrix() might do (parts of) what 
you want.

Best,
Sereina


Re: [Rdkit-discuss] Conformer generation

2017-10-25 Thread David Hall
Hi Paul,

Your reuse of the variable num_confs inside the loop is causing that
monotonic decrease. So, if a molecule returns 190 conformers, the next
iteration has you only asking for 190 conformers. And so on.
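
Here is a minimal sketch of that effect in plain Python. The function fake_embed is a hypothetical stand-in for the embedding step (it is not an RDKit call); it just assumes that pruning leaves 95% of the requested conformers. The point is only the variable handling: keep the requested maximum and the returned count in separate names.

```python
def fake_embed(requested):
    """Hypothetical stand-in: pretend pruning keeps 95% of the requested conformers."""
    return requested * 95 // 100


def run_buggy(n_mols, max_confs=200):
    """Reuses num_confs for both the request and the result."""
    num_confs = max_confs
    requested = []
    for _ in range(n_mols):
        requested.append(num_confs)
        num_confs = fake_embed(num_confs)  # overwrites the request size
    return requested


def run_fixed(n_mols, max_confs=200):
    """Keeps the request size constant and stores the result separately."""
    requested = []
    for _ in range(n_mols):
        requested.append(max_confs)
        n_returned = fake_embed(max_confs)  # separate variable for the result
    return requested


print(run_buggy(5))
print(run_fixed(5))
```

In the script from the original post, the equivalent change would be to store conf_mol.GetNumConformers() under a different name than num_confs.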

Best,
David



On Wed, Oct 25, 2017 at 12:36 PM, Paul Hawkins 
wrote:

> Hello,
>
>
> I have run into a problem with using the RDKit to generate conformers of
> molecules. I am using the following code:
>
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> from timeit import default_timer as timer
>
>
> def GenerateDGConfs(m, num_confs, rms):
>     start_time = timer()
>     ids = AllChem.EmbedMultipleConfs(m, numConfs=num_confs,
>                                      pruneRmsThresh=rms, maxAttempts=200,
>                                      enforceChirality=True)
>     for id in ids:
>         AllChem.MMFFOptimizeMolecule(m, confId=id)
>
>     end_time = timer()
>     time_diff = end_time - start_time
>     #print ("Normal DG = %0.2f" % time_diff)
>
>     return m, list(ids), time_diff
>
>
> w = Chem.SDWriter("%s/%s" % (rootdir, "My_conformers.sdf"))
>
> suppl = Chem.SDMolSupplier("%s/%s" % (rootdir, "My_molecules.sdf"))
> num_confs = 200
> rmsd = 0.5
>
> for mol in suppl:
>     if mol is None: continue
>
>     Chem.AssignAtomChiralTagsFromStructure(mol)
>     mol1 = Chem.AddHs(mol)
>
>     conf_mol, id_list, time_diff = GenerateDGConfs(mol1, num_confs, rmsd)
>     num_confs = conf_mol.GetNumConformers()
>     for id in id_list:
>         w.write(conf_mol, confId=id)
>
> w.flush()
> w.close()
>
>
> What I see from this is that, as I go through the molecules in the input
> file, the number of conformers returned declines monotonically, from close
> to the 200 I set as a maximum to around 10 after a few thousand molecules
> have been processed (this applies whether I use 'normal' DG or the ETKDG
> method). As I am a new user of RDKit I am sure I missed something obvious
> but I cannot see it.
>
> Also, once I generate the conformers what is the best way to cluster them by
> RMSD so that each conformer has a minimum RMSD to all the others in the set?
>
>
> Any help would be gratefully received.
>
>
>
> Paul.
>
> 


Re: [Rdkit-discuss] Conformer generation using ETKDG

2017-10-04 Thread Greg Landrum
Hi Jordan,

On Thu, Oct 5, 2017 at 6:48 AM, Jordan McCone  wrote:

>
> I have a .smi file which has a number of smiles strings in it. I would
> like to generate a single 3D conformer using ETKDG for every smiles string
> in the list,
>

If you google around, you will find a number of emails, posts, scripts, and
even RDKit "getting started" and cookbook entries for generating
conformations. Reading in a .smi file is covered in the Getting Started
guide, as well as other places.


> and then get the output as a single .mol2 file.
>

The RDKit does not create .mol2 files, so I'm afraid you're out of luck
there. Any chance that you can provide another format?

-greg


[Rdkit-discuss] Conformer generation using ETKDG

2017-10-04 Thread Jordan McCone
Hi all,

I have a .smi file which has a number of smiles strings in it. I would like
to generate a single 3D conformer using ETKDG for every smiles string in
the list, and then get the output as a single .mol2 file.

I am very new to RDKit and even Python, and as much as I'd like to do this
myself, I need to have these ligands ready ASAP for a virtual screening
campaign we are about to start.

Does anyone have a solution to this?

Thanks in advance
Jordan


Re: [Rdkit-discuss] Conformer generation does not sample well?

2016-06-29 Thread Steven Kearnes
These papers may be relevant:

   - http://pubs.acs.org/doi/full/10.1021/ci2004658
   - https://jcheminf.springeropen.com/articles/10.1186/1758-2946-3-4



Re: [Rdkit-discuss] Conformer generation does not sample well?

2016-06-29 Thread Sereina
Hi Tim,

I had a look at the 1DWD example and I detail in the following the analysis I 
did as it may be useful to other users.

The conformer generation function has the option to print the experimental 
torsion preferences that were used in the generation:
printExpTorsionAngles=True

For 1DWD, it gives the following output:

> from rdkit import Chem, RDConfig
> from rdkit.Chem import AllChem
> ref = Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@@H](NC(=O)CNS(=O)(=O)c2ccc3ccccc3c2)C(=O)N2CCCCC2)cc1')
> mol1 = Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb')
> mol1 = AllChem.AssignBondOrdersFromTemplate(ref, mol1)
> mol1 = Chem.AddHs(mol1)
> numConfs = 100
> cids = AllChem.EmbedMultipleConfs(mol1, numConfs=numConfs,
>                                   useExpTorsionAnglePrefs=True,
>                                   useBasicKnowledge=True,
>                                   printExpTorsionAngles=True)
O=[S:1](=O)[NX3H1:2]!@;-[CX4H2:3][!#1:4]: 0 13 14 15, (17.9, -13.3, 9.2, 4.7, 
-2.3, -0.9) 
O=[C:1][NX3H1:2]!@;-[CX4H1:3][H:4]: 15 17 18 48, (5, 0, 0, 0, 0, 0) 
[O:1]=[CX3:2]!@;-[NX3H0:3][!#1:4]: 20 19 31 32, (0, 8, 0, 0, 0, 0) 
[O:1]=[CX3:2]!@;-[NX3H1:3][!#1:4]: 16 15 17 18, (100, 0, 0, 0, 0, 0) 
[c:1][S:2](=O)(=O)!@;-[NX3H1:3][C:4]: 4 0 13 14, (0, 16, 5, 7, 0, 0) 
[aH1:1][c:2]([aH1])!@;-[SX4:3][!#1:4]: 3 4 0 13, (0, 1.5, 0, 0, 0, 0) 
N[C:2](=[O:1])!@;-[CH2:3][N:4]: 16 15 14 13, (0, 10, 0, 0, 0, 0) 
[!#1:1][CX4:2]!@;-[CX4:3][!#1:4]: 17 18 21 22, (0, 0, 7, 0, 0, 0) 
[NH1:1][CX4:2]!@;-[CX3:3]=[O:4]: 17 18 19 20, (0, 2.1, -0.1, -1, -0.4, -0.1) 
[cH1:1][c:2]([cH1])!@;-[CX4H2:3][CX4:4]: 23 22 21 18, (0, 3.6, 0, 0, 0, 0) 
[a:1][c:2]!@;-[C:3](=[N:4]): 25 27 28 30, (0, 5, 0, 0, 0, 0) 
[*:1][X3,X2:2]=[X3,X2:3][*:4]: 27 28 30 57, (0, 100, 0, 0, 0, 0)

The output is structured as follows: First comes the SMARTS pattern, then the 
indices of the four atoms involved and finally the six force constants K_1, 
K_2, K_3, K_4, K_5, K_6 for the torsion potential (see Eq. (2) in JCIM, 55, 
2562, 2015).
The position of the non-zero force constant tells you the multiplicity i of the 
cosine. E.g. for the torsion between atoms 18 and 21 the third force constant 
is 7.0 and all others are zero, thus the torsion potential for this bond has a 
multiplicity i = 3 (i.e. three maxima). The information about the phase shift 
is not exposed, but we can change that if people find it useful. 
Currently to get the full information about the torsion potential, you need to 
search for the SMARTS pattern in 
Code/GraphMol/ForceFieldHelpers/CrystalFF/TorsionPreferences.cpp
There you find the line: "[!#1:1][CX4:2]!@;-[CX4:3][!#1:4] 1 0.0 1 0.0 1 7.0 1 
0.0 1 0.0 1 0.0\n"
Here we have twelve parameters, always two for a given multiplicity: first the 
phase shift (can be 1 or -1) and second the force constant.
For this particular torsion pattern we have a multiplicity of 3, a phase shift 
of cos(d) = 1 and a force constant K_3 = 7.0, which corresponds to three peaks 
at 60°, 180° and 300° in the range [0°, 360°] (or -60°, 60° and 180° in the 
range [-180°, 180°]).
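
To make the reading of these parameters concrete, here is a small sketch in plain Python. It assumes the potential has the form V(phi) = sum over i of K_i * (1 + s_i * cos(i * phi)), where s_i is the +1/-1 phase factor; that form is my reading of the description above, not something copied from the paper or the RDKit source. The function names are made up for illustration.

```python
import math

# The twelve parameters from the quoted line, as (sign, force constant) pairs.
params = [1, 0.0, 1, 0.0, 1, 7.0, 1, 0.0, 1, 0.0, 1, 0.0]
signs = params[0::2]            # s_1 ... s_6
force_constants = params[1::2]  # K_1 ... K_6


def torsion_energy(phi_deg, signs, force_constants):
    """Assumed form: V(phi) = sum_i K_i * (1 + s_i * cos(i * phi))."""
    phi = math.radians(phi_deg)
    return sum(k * (1.0 + s * math.cos(m * phi))
               for m, (s, k) in enumerate(zip(signs, force_constants), start=1))


def potential_minima(signs, force_constants):
    """Angles (whole degrees in [0, 360)) that are local minima of the potential."""
    energies = [torsion_energy(a, signs, force_constants) for a in range(360)]
    return [a for a in range(360)
            if energies[a] < energies[a - 1] and energies[a] < energies[(a + 1) % 360]]


print(potential_minima(signs, force_constants))
```

Under this assumed form, the minima of the potential are where the sampled torsion angles are expected to pile up, which is what "peaks" refers to above.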

The two bonds connecting the benzamidine ring to the rest of the molecule are 
between atoms 22/21 and 21/18.
I calculated the signed dihedral angle for these two bonds, here is the code 
for the second one:

> import math
> from rdkit import Geometry
> angles = []
> for cid in cids:
>     conf = mol1.GetConformer(id=cid)
>     # torsion of atoms 17 18 21 22
>     p1 = conf.GetAtomPosition(17)
>     p2 = conf.GetAtomPosition(18)
>     p3 = conf.GetAtomPosition(21)
>     p4 = conf.GetAtomPosition(22)
>     a = Geometry.ComputeSignedDihedralAngle(p1, p2, p3, p4) / math.pi * 180.0
>     angles.append(a)

For 100 confs, this gives the following distribution:

So, the torsion potential is sampled as it is supposed to be. 
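
For readers who want to reproduce this kind of distribution without a plot, here is a rough sketch in plain Python of how a signed dihedral can be computed from four points and how a list of angles can be binned into a coarse text histogram. The function names are illustrative only and are not RDKit API; the sign convention used here is an assumption and may differ from the one RDKit uses.

```python
import math
from collections import Counter


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def signed_dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral angle p1-p2-p3-p4 in degrees, in [-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    norm_b2 = math.sqrt(_dot(b2, b2))
    unit_b2 = (b2[0] / norm_b2, b2[1] / norm_b2, b2[2] / norm_b2)
    m1 = _cross(n1, unit_b2)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def angle_histogram(angles, bin_width=30):
    """Count angles per bin; keys are the lower edges of the bins."""
    return Counter(int(math.floor(a / bin_width)) * bin_width for a in angles)


# Toy coordinates: a 90 degree twist around the z axis and its mirror image.
plus = signed_dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
minus = signed_dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1))
print(plus, minus)

for lower_edge, count in sorted(angle_histogram([-61, -59, 60, 179]).items()):
    print(f"{lower_edge:5d} {'#' * count}")
```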


For the other bond between atoms 22/21, we have the following line in 
TorsionPreferences.cpp:
"[cH1:1][c:2]([cH1])!@;-[CX4H2:3][CX4:4] 1 0.0 1 3.6 1 0.0 1 0.0 1 0.0 1 0.0\n"
This means, we have a multiplicity of 2, a phase shift cos(d) = 1 and a force 
constant of 3.6, which corresponds to two peaks at -120° and 120° in the range 
[-180°, 180°].
The histogram looks like:

Again, the potential is largely sampled as it’s supposed to be.

Now, let’s have a look at the two bonds connecting atom 18 to the other two 
branches.
For the bond between atoms 18/17, we should have a single peak at 0°, which we 
get most of the time.


The bond between atoms 18/19 has a multi-term potential, which is a bit more 
difficult to interpret from the numbers alone.
"[NH1:1][CX4:2]!@;-[CX3:3]=[O:4] 1 0.0 -1 2.1 -1 -0.1 -1 -1.0 1 -0.4 1 -0.1\n"
It has three peaks at -150°, 0° and 150° in the range [-180°, 180°]. The peak 
at 0° is broad.

The histogram looks like:

For the 100 conformers, the peaks at -150° and 150° are not sampled. I 
therefore generated 1000 conformers, but the sampling does not really improve:


The algorithm of ETKDG takes a randomly generated conformer from standard 
distance geometry and minimizes it with the ET and K terms. For each bond with 
an experimental torsion term, the torsion angle is minimized to the closest 
minimum. In other words, ETKDG relies 

[Rdkit-discuss] Conformer generation does not sample well?

2016-06-22 Thread Tim Dudgeon
This topic (https://sourceforge.net/p/rdkit/mailman/message/35173301/) 
discussed using conformer generation as input into Open3DAlign.

One thing I noticed is that the conformer generation (using the 
useExpTorsionAnglePrefs=True and
useBasicKnowledge=True options) does not generate conformers that align 
well for this example. The input is based on the 1DWD_ligand.pdb 
structure in the RDKit distro. What I find is that the rotation of the 
benzamidine ring is never in the right place for alignment (the other 
two ring systems align well). This is when generating up to 10,000 
conformers.

Does this suggest that the conformer generation does not sample 
conformational space very effectively? Are there options for improving this?

Thanks

Tim



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Conformer generation problem -- stereochemistry

2011-11-09 Thread JP
Please disregard this email -- I hit the send button with too much haste...

-
Jean-Paul Ebejer
Early Stage Researcher


On 9 November 2011 15:47, JP jeanpaul.ebe...@inhibox.com wrote:


 Hi there RDKiters,

 Using the greatest/latest official RDKit release (2011_09_1), I am
 generating 50 conformers of a molecule,  (+/-) endo-2-amino norbornane,
 with specified stereochemistry.
 The smiles string I use is C1CCC(S(N[C@H]2[C@H]3CC[C@@H](C2)C3)(=O)=O)C1

 However the stereochemistry of the bridge is not always maintained in the
 conformers generated.  In particular I am attaching my output file
 (50confs.sdf) - and this can be seen in molecules 12 (good) and 13 (bad) -
 starting to count from 1 (!).

 I am also attaching two pictures (good.png and bad.png) of the output with
 the inverted stereochemistry of the bridge highlighted in the bad
 structure.
 Also, please find the minimal script (conformer_ex.py) which I used to
 generate these conformers.

 Any ideas what I can do to rectify this situation?

 -
 Jean-Paul Ebejer
 Early Stage Researcher



[Rdkit-discuss] Conformer Generation using RDKit (basic)

2010-10-08 Thread Jean-Paul Ebejer
Is it possible to automatically generate say - 50 conformers out of a SMILES
string ?

Something like -

x = generateConformers(50, 'C1CCC1OC')

Cheers
JP


Re: [Rdkit-discuss] Conformer Generation using RDKit (basic)

2010-10-08 Thread Greg Landrum
Hi,

On Fri, Oct 8, 2010 at 6:11 PM, Jean-Paul Ebejer
jeanpaul.ebe...@inhibox.com wrote:

 Is it possible to automatically generate say - 50 conformers out of a SMILES
 string ?
 Something like -
 x = generateConformers(50, 'C1CCC1OC')

There's not anything quite that simple out-of-the-box, but it wouldn't
be hard to do on your own. The calls you need to make are:

from rdkit.Chem import AllChem
m = AllChem.AddHs(AllChem.MolFromSmiles('C1CCC1OC'))
cids = AllChem.EmbedMultipleConfs(m, 50)

You should probably then check the list of conformation ids to make
sure you actually got 50:
len(cids)  # returns 50 here

The next likely question is what to do with those conformations.
Here's an example of writing them all to an SD file:

w = AllChem.SDWriter('mol.confs.sdf')
for cid in cids:
    w.write(m, confId=cid)
w.close()

In the past there have been several posts to the mailing list about
generating 3D coordinates with the RDKit that include some useful
tips. I'd suggest searching the list archive
(http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/) to
find these.

Best Regards,
-greg
