Re: [Rdkit-discuss] atom indexing in mol and conformer

2022-06-19 Thread Sereina Riniker
Dear Ling,

Yes, the atom indexing is the same for all conformers of a molecule.
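
As an illustration (not part of the original exchange), the following minimal sketch embeds a few conformers and reads every position through the atom index of the parent molecule. Ethanol, the seed value and the variable names are arbitrary assumptions chosen for the example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Ethanol is an arbitrary example molecule chosen for illustration.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMultipleConfs(mol, numConfs=3, randomSeed=42)

positions = {}  # (conformer id, atom index) -> (element symbol, xyz tuple)
for conf in mol.GetConformers():
    for atom in mol.GetAtoms():
        idx = atom.GetIdx()
        pos = conf.GetAtomPosition(idx)  # same idx as mol.GetAtomWithIdx(idx)
        positions[(conf.GetId(), idx)] = (atom.GetSymbol(), (pos.x, pos.y, pos.z))

# Every conformer stores exactly one position per atom of the parent molecule.
same_size = all(c.GetNumAtoms() == mol.GetNumAtoms() for c in mol.GetConformers())
print(same_size)
```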

Best regards,
Sereina



> On 19 Jun 2022, at 00:04, Ling Chan  wrote:
> 
> Dear colleagues,
> 
> Just wonder if the atom indexing in a conformer is always identical to that 
> of the parent molecule? I suspect it is but would like to confirm.
> 
> Specifically, I would like to confirm that
>   for conf in mol.GetConformers():
>       conf.GetAtomPosition(idx)
> always corresponds to the atom
>   mol.GetAtomWithIdx(idx)
> 
> Thank you.
> 
> Ling
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss





Re: [Rdkit-discuss] TFD and RMSD for macrocycles

2022-01-19 Thread Sereina Riniker
Hi Paul,

TFD was developed for drug-like molecules with small rings. The torsions of 
ring bonds are therefore combined into a single average value for each ring 
(see Figure 1 in J. Chem. Inf. Model., 52, 1499, 2012). This of course makes 
little sense for a macrocycle and is likely the cause of your results.

If you are interested in the macrocycle conformation alone, I would recommend 
using either ringRMSD (with or without beta-atoms) or a torsional-angle RMSD 
(be careful with the periodicity). The latter has the advantage that no 
alignment is needed.
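
A minimal sketch of a periodicity-aware torsional-angle RMSD follows. It is an illustration only, not the code used in the cited papers. It assumes the torsion angles (in degrees) of the two conformations have already been measured for the same list of bonds, for example with rdMolTransforms.GetDihedralDeg(); the function name torsion_rmsd is hypothetical.

```python
import math


def torsion_rmsd(angles_a, angles_b):
    """RMSD between two equally long lists of torsion angles in degrees."""
    if len(angles_a) != len(angles_b) or not angles_a:
        raise ValueError("need two non-empty angle lists of equal length")
    total = 0.0
    for a, b in zip(angles_a, angles_b):
        # Wrap the difference into [-180, 180) so that 179 and -179 degrees
        # count as 2 degrees apart rather than 358.
        diff = (a - b + 180.0) % 360.0 - 180.0
        total += diff * diff
    return math.sqrt(total / len(angles_a))


print(torsion_rmsd([179.0], [-179.0]))
```

Because only internal angles are compared, no superposition of the two structures is required.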

Examples for the use of ringRMSD with macrocycles are in the ETKDG version 3 
paper (J. Chem. Inf. Model., 60, 2044, 2020) or the recent noeETKDG paper 
(https://pubs.acs.org/doi/10.1021/acs.jcim.1c01165).

Hope this helps.

Best,
Sereina


> On 19 Jan 2022, at 16:37, mix_of_reasons via Rdkit-discuss 
>  wrote:
> 
> 
> Hi RDKitters,
> 
> I am using the RDKit implementation of TFD to examine conformational 
> differences between macrocycles and to cluster their conformations. In some 
> basic testing of a set of conformations of the same macrocycle from the PDB I 
> find something unexpected:
> 
> from rdkit.Chem import TorsionFingerprints
> from rdkit.Chem.rdMolAlign import GetBestRMS
> 
> # mollist = list of conformations of the same molecule from the PDB
> 
> tfds = []
> rmsds = []
> for m in mollist:
>     for n in mollist:
>         tfd = TorsionFingerprints.GetTFDBetweenMolecules(m, n, maxDev='spec',
>                                                          useWeights=False)
>         rmsd = GetBestRMS(m, n)
>         tfds.append(tfd)
>         rmsds.append(rmsd)
> 
> #Plot the two lists
> 
> 
> 
> Comfortingly, all cases of RMS = 0 also give TFD = 0.
> According to the original paper a TFD of 1 implies maximal torsional 
> deviation, yet here I see a very low RMSD (0.3-4A, essentially insignificant 
> for molecules of this size) at TFD = 1.
> Also useWeights = True in the code above gives TFDs > 1, which is clearly not 
> possible in the spirit of the original idea, but probably arises from there 
> not really being a graph centre in a macrocycle.
> 
> The idea of clustering macrocycles based on some measure of distance in 
> torsion space is very appealing, but I am concerned by TFD = 1 being 
> calculated for conformations that have essentially the same geometry. Any 
> suggestions on how to proceed?
> 
> 
> Paul.


Re: [Rdkit-discuss] Failing when embeding molecule with several fragments

2020-11-05 Thread Sereina Riniker
Dear Pablo,

The RDKit conformer generator is not really suitable to generate coordinates 
for arrangements of multiple molecules. 
For this, I would go for tools implemented in MD packages.

Best regards,
Sereina


> On 5 Nov 2020, at 14:56, Pablo Ramos  wrote:
> 
> Hello everybody,
>  
> I am trying to generate 3D coordinates and optimize the system with MM.
> When optimizing, atoms overlap for one of the O=C(Cl)Cl fragments.
>  
> This is my code:
> smiles = 'Cc1ccc(N)cc1N.O=C(Cl)Cl.O=C(Cl)Cl'
> m = Chem.MolFromSmiles(smiles)
> m = Chem.AddHs(m)
> AllChem.EmbedMolecule(m, useRandomCoords = True)
> ffu = AllChem.UFFGetMoleculeForceField(m, ignoreInterfragInteractions = False)
> ffu.Initialize()
> ffu.Minimize(maxIts = 500)
>  
> To be sure that this is not a convergence problem, I tried a high value of 
> maxIts in ffu.Minimize() as well as a high number of maxAttempts for the 
> embedding, both without success.
>  
> Thanks a lot, 
>  
> Best regards,
>  
> Pablo Ramos
> Ph.D. at Covestro Deutschland AG
>  


Re: [Rdkit-discuss] ConstrainedEmbed issue

2020-07-07 Thread Sereina Riniker
Dear Pavel and Sunhwan,

Please note that hydrogens should always be added for the embedding algorithm 
to work properly (i.e. adding them is not a workaround but the required 
procedure).
See also the section “Working with 3D Molecules” in 
https://www.rdkit.org/docs/GettingStartedInPython.html
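
A minimal sketch of that order of operations, using the child molecule from the thread. The seed value and variable names are assumptions made for the example, and whether the embedding succeeds may depend on the RDKit version.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("C[C@@H]1CC[C@H](O)CC1=O")
mol_h = Chem.AddHs(mol)  # add explicit hydrogens before embedding

# EmbedMolecule returns the new conformer id, or -1 if embedding failed.
conf_id = AllChem.EmbedMolecule(mol_h, randomSeed=42)

if conf_id >= 0:
    # Optional: strip the hydrogens again; heavy-atom coordinates are kept.
    mol_heavy = Chem.RemoveHs(mol_h)
    print(mol_heavy.GetNumConformers())
```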

Best regards,
Sereina



> On 7 Jul 2020, at 21:26, Sunhwan Jo  wrote:
> 
> 
> The reason ConstrainedEmbed didn’t work is that the molecule simply can’t be 
> embedded by RDKit’s algorithm.
> 
>> In [25]: mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')   
>>  
>> 
>> In [26]: AllChem.EmbedMolecule(mol_child)
>>  
>> Out[26]: -1
> 
> 
> See more discussion here:
> https://github.com/rdkit/rdkit/issues/2996 
> <https://github.com/rdkit/rdkit/issues/2996>
> 
> 
> The SMILES you posted looks valid to me and doesn’t look that complicated, 
> but I think RDKit’s algorithm somehow tripped up and couldn’t finish 
> embedding without some help. Hope someone with more in-depth insight can 
> help here.
> 
> 
> Anyway, as a workaround, adding Hs seems to do the trick:
> 
>> In [39]: mol = AllChem.AddHs(mol_child)  
>>  
>> 
>> In [40]: AllChem.EmbedMolecule(mol)  
>>  
>> Out[40]: 0 # worked
>> 
>> In [41]: AllChem.ConstrainedEmbed(mol, mol_parent)   
>>  
>> Out[41]:  # also worked
>> 
> 
> 
> 
> Sunhwan
> 
> 
> 
> 
>> On Jul 7, 2020, at 12:36 AM, Pavel Polishchuk <pavel_polishc...@ukr.net> wrote:
>> 
>> Hi all,
>> 
>>   I have an issue with ConstrainedEmbed and I cannot figure out what exactly 
>> causes it.
>>   I have a molecule C[C@@H]1CCCCC1=O with 3D coordinates in the file 1.mol 
>> (attached), and I want to generate coordinates for another structure with 
>> this core:
>> C[C@@H]1CC[C@H](O)CC1=O.
>> 
>>   This is the usual way, which causes an embedding failure and the 
>> corresponding error.
>> 
>> mol_parent = Chem.MolFromMolFile('1.mol')
>> mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
>> try:
>>     mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)
>> except ValueError as e:
>>     print(e)
>> 
>>   If I add explicit hydrogens the issue disappears.
>> 
>> mol_parent = Chem.MolFromMolFile('1.mol')
>> mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
>> mol = AllChem.ConstrainedEmbed(Chem.AddHs(mol_child), mol_parent)
>> 
>>   If I do not use pre-defined coordinates - everything works well.
>> 
>> mol_parent = Chem.MolFromSmiles('C[C@@H]1CCCCC1=O')
>> AllChem.EmbedMolecule(mol_parent)
>> mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
>> mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)
>> 
>>   Do the ugly coordinates in the 1.mol file cause the embedding issue? Or is 
>> the issue caused by some implicit properties of the molecule? How can I 
>> solve this properly?
>> 
>> Kind regards,
>> Pavel.


[Rdkit-discuss] ETKDG improvement for small and large rings

2020-05-11 Thread Sereina Riniker
Dear RDKit Users, 

For your information (and to make a bit of advertisement): 
We have recently developed and published an extension of the ETKDG conformer 
generator to improve sampling of small and large rings, which is available in 
the 2020.03 release of the RDKit.

Shuzhe Wang, Jagna Witek, Greg Landrum, Sereina Riniker, J. Chem. Inf. Model., 
60, 2044 (2020)
"Improving Conformer Generation for Small Rings and Macrocycles Based 
on Distance Geometry and Experimental Torsional-Angle Preferences"
https://pubs.acs.org/doi/10.1021/acs.jcim.0c00025

If you want to try it out, Shuzhe has added a section in the RDKit cookbook to 
showcase the new functionalities:
https://github.com/rdkit/rdkit/blob/master/Docs/Book/Cookbook.rst#conformer-generation-with-etkdg

We hope that you find it useful and we’re happy for any feedback!

Best regards,
Sereina


 - - - 

Prof. Dr. Sereina Riniker
ETH Zürich
Laboratory of Physical Chemistry
HCI G225
Vladimir-Prelog-Weg 2
8093 Zürich
+41 44 633 42 39
srini...@ethz.ch
www.riniker.ethz.ch



Re: [Rdkit-discuss] regarding hydrogens from SMILES

2019-10-08 Thread Sereina
Hi Jorgen, 

Which version of RDKit are you using? The ETKDG conformer generator (which will 
keep sp2 centers flat) has only recently become the default. If you are using 
an older RDKit version, the following code should give you a flat aromatic 
system for the SMILES you provided in your example.

m = Chem.MolFromSmiles('C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N')
mH = AllChem.AddHs(m)
AllChem.EmbedMolecule(mH, params=AllChem.ETKDGv2())

Best regards,
Sereina


> On 8 Oct 2019, at 18:18, Paolo Tosco  wrote:
> 
> Hi Jorgen,
> 
> use the MMFF94s variant of the forcefield if you wish to force trigonal 
> nitrogens to be planar:
> 
> AllChem.MMFFOptimizeMolecule(m2, mmffVariant="MMFF94s")
> 
> More information here:
> https://doi.org/10.1002/(SICI)1096-987X(199905)20:7%3C720::AID-JCC7%3E3.0.CO;2-X
>  
> <https://doi.org/10.1002/(SICI)1096-987X(199905)20:7%3C720::AID-JCC7%3E3.0.CO;2-X>
> Cheers,
> p.
> 
> On 10/08/19 15:27, Jorgen Simonsen wrote:
>> Cheers Paolo, 
>> 
>> It looks like that it keeps sp3 as the optimal geometry and not sp2. 
>> The optimization did converge :
>> 
>> AllChem.MMFFOptimizeMolecule(m2,)
>> 
>> #returned 1
>> 
>> I think it is getting the types wrong or I have to specify the types? 
>> 
>> 
>> 
>> On Tue, Oct 8, 2019 at 10:10 AM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:
>> Hi Jorgen,
>> 
>> optimizing your molecule geometry with UFF or MMFF should fix the problem:
>> 
>> AllChem.UFFOptimizeMolecule(m2)
>> 
>> or
>> 
>> AllChem.MMFFOptimizeMolecule(m2)
>> 
>> see rdkit.Chem.rdForceFieldHelpers.UFFOptimizeMolecule 
>> <https://www.rdkit.org/docs/source/rdkit.Chem.rdForceFieldHelpers.html?highlight=optimize#rdkit.Chem.rdForceFieldHelpers.UFFOptimizeMolecule>
>>  or rdkit.Chem.rdForceFieldHelpers.MMFFOptimizeMolecule 
>> <https://www.rdkit.org/docs/source/rdkit.Chem.rdForceFieldHelpers.html?highlight=optimize#rdkit.Chem.rdForceFieldHelpers.MMFFOptimizeMolecule>.
>> 
>> Cheers,
>> p.
>> 
>> On 10/08/19 14:41, Jorgen Simonsen wrote:
>>> Hi all, 
>>> 
>>> I am trying to build 3D structures from SMILES, which works fine for most 
>>> molecules. I get the SMILES from PubChem ('canonical_smiles' and 
>>> 'isomeric_smiles'), but for some of the molecules the hydrogens are not 
>>> added correctly and are out of plane, e.g. the amide group in ATP (see 
>>> below for an example) or arginine in a peptide.
>>> 
>>> I use the following code to generate the 3D structure :
>>> 
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> m1 = Chem.MolFromSmiles('C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N')
>>> 
>>> m2 = Chem.AddHs(m1)
>>> AllChem.EmbedMolecule(m2)
>>> 
>>> w = Chem.SDWriter('foo.sdf')
>>> w.write(m2)
>>> 
>>> # or to mol file
>>> 
>>> print(Chem.MolToMolBlock(m2),file=open('foo.mol','w+'))
>>> 
>>> How can I ensure that the atom types are correct?
>>> 
>>> Thanks in advance
>>> 
>>> Best 
>>> Jorgen


Re: [Rdkit-discuss] Understanding Coloring in Similarity Maps

2019-08-22 Thread Sereina
Hi Axel,

What is calculated in the function GetAtomicWeightsForModel() is the difference 
between the probability value of the complete molecule (“base probability”) and 
the probability value when the bits of a certain atom are deleted. 

In the cookbook (and based on a quick glance also in your code), the 
probability of the active class is used as the measure for the similarity maps 
(that’s defined in the getProba() helper function). This means that any atom 
whose missing bits lead to an increase in the probability to be active is 
colored green. If it leads to a decrease, it gets colored pink. 

Now if you have an inactive molecule then your base probability for the active 
class is close to zero. In your cases it looks like nearly all of the atoms in 
the molecule are necessary for these molecules to be considered inactive. In 
other words, deleting any of the green-colored atoms results in a higher 
probability to be active, although it might still be below 50% (note that the 
color range is not standardized globally but based on the largest difference 
observed in the molecule).
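
The per-molecule scaling can be sketched in plain Python as follows. This is an illustration of the idea only, not the RDKit implementation: the helper names and the probability values are hypothetical, and the sign convention (base probability minus the probability with the atom's bits removed) is taken from the description above.

```python
def atom_weights(base_proba, probas_without_atom):
    """Difference between the base probability and each perturbed probability."""
    return [base_proba - p for p in probas_without_atom]


def scale_per_molecule(weights):
    """Scale weights by the largest absolute difference within this molecule."""
    largest = max(abs(w) for w in weights)
    if largest == 0:
        return [0.0 for _ in weights]
    return [w / largest for w in weights]


# Hypothetical inactive molecule: base probability of the active class is 5%.
weights = atom_weights(0.05, [0.06, 0.10, 0.05])
scaled = scale_per_molecule(weights)
print(scaled)
```

In this made-up case no perturbed probability exceeds 10%, yet the second atom still reaches the end of the color scale, because the scale is set by the largest difference found in that molecule.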

I hope this helps.

Best,
Sereina 


> On 22 Aug 2019, at 11:38, Axel Pahl  wrote:
> 
> Dear fellow RDKitters,
> 
> I am experimenting with the classification example from the Cookbook [1] 
> using a RandomForestClassifier and Similarity Maps for visualization.
> I need, however, some help with the interpretation of the coloring in the 
> similarity map.
> In the attached example, the compounds were correctly predicted ("AC_Pred") 
> as being inactive ("0") with a high probability.
> But the corresponding similarity maps show mainly green areas, indicating (in 
> my understanding) a positive contribution to the activity class, which should 
> have led to a different prediction.
> 
> What would be the correct interpretation of the coloring?
> Many thanks in advance for any help.
> 
> Kind regards,
> Axel
> 
> P.S.: The code is available in a repo [2], an example notebook can be found 
> in the tutorials folder.
> 
> [1] http://www.rdkit.org/docs/Cookbook.html#using-scikit-learn-with-rdkit 
> <http://www.rdkit.org/docs/Cookbook.html#using-scikit-learn-with-rdkit>
> [2] https://github.com/apahl/mol_frame <https://github.com/apahl/mol_frame>
> 


Re: [Rdkit-discuss] PDBBlock file

2018-07-06 Thread Sereina
The default conformer generator in RDKit is plain distance geometry, which is 
known not to produce perfectly flat aromatic rings.
You can use the ETKDG conformer generator instead:

AllChem.EmbedMolecule(mol, params=AllChem.ETKDGv2())
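
One way to check the result is to measure the ring torsions, which should all be close to 0 degrees for a flat six-membered ring. The sketch below is an illustration under assumptions: the seed value, the variable names and the use of rdMolTransforms.GetDihedralDeg() for the check are not from the original message.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

mol = Chem.AddHs(Chem.MolFromSmiles("c1ccccc1"))  # benzene with explicit Hs
params = AllChem.ETKDGv2()
params.randomSeed = 42  # arbitrary fixed seed for reproducibility
conf_id = AllChem.EmbedMolecule(mol, params)
conf = mol.GetConformer(conf_id)

# The ring carbons keep indices 0-5 after AddHs; walk around the ring and
# record the absolute value of each ring torsion.
ring = [0, 1, 2, 3, 4, 5]
torsions = [
    abs(rdMolTransforms.GetDihedralDeg(
        conf, ring[i], ring[(i + 1) % 6], ring[(i + 2) % 6], ring[(i + 3) % 6]))
    for i in range(6)
]
max_torsion = max(torsions)
print(max_torsion)
```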

Best,
Sereina


> On 6 Jul 2018, at 17:54, Phuong Chau  wrote:
> 
> Follow-up question regarding the PDB file:
> 
> Thank you for your help. I was able to create the non-zero coordinates of the 
> chemical. However, when I tried to view it in VMD, the chemical that I used 
> is c1ccccc1 (benzene). The ring itself (and the atoms coming off it, which 
> are mostly hydrogens) should all be in-plane but the pdb file shows that it 
> is slightly puckered. Can anyone explain this to me? How can I make it 
> in-plane?
> 
> Here is a screenshot of the molecule.
> 
> ​
> 
> On Fri, Jul 6, 2018 at 6:21 AM, Dmitri Maziuk via Rdkit-discuss 
> <rdkit-discuss@lists.sourceforge.net> wrote:
> On 7/5/2018 1:39 PM, Paolo Tosco wrote:
> 
> As the PDB format includes no stereochemistry, no coordinates are needed, and 
> by default they are zero, as the molecule does not have a conformation yet.
> 
> Hmm. One could argue that PDB format *is* 3D coordinates, so a block with all 
> zeroes is quite pointless. And of course counter-intuitive.
> 
> Dima
> 
> 
> 
> -- 
> Phuong Chau 
> Smith College '20
> Engineering Major 


Re: [Rdkit-discuss] Behavior of ETKDG / EmbedMultipleConfs

2018-01-14 Thread Sereina
Hi Andy,

If -1 is used for the random number seed, the RDKit will use the current date 
(including seconds) as seed (Greg, please correct me if I’m wrong). Therefore, 
you get a different seed every time you run the script. If you use a fixed 
seed, you will generate the same conformations every time you run it. Note that 
if pruneRmsThresh > 0, the generated conformers will be pruned, i.e. conformers 
with an RMS < cutoff to any previous conformer will be discarded. As this 
happens at the very end of the conformer generation routine, no additional 
conformers will be generated to replace the discarded ones. This is why you get 
a varying number of conformers. 
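
Both effects can be seen in a small sketch. It assumes single-threaded embedding and uses 1-butanol as an arbitrary stand-in molecule; the helper name embed() and the parameter values are hypothetical choices for the example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed(smiles, seed, prune_rms, n_confs=10):
    """Embed up to n_confs conformers with a given seed and pruning threshold."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    params = AllChem.ETKDG()
    params.randomSeed = seed           # fixed seed -> reproducible conformers
    params.pruneRmsThresh = prune_rms  # discard conformers closer than this
    conf_ids = AllChem.EmbedMultipleConfs(mol, n_confs, params)
    return mol, list(conf_ids)


smiles = "CCCCO"  # arbitrary example molecule
mol_a, ids_a = embed(smiles, seed=42, prune_rms=0.5)
mol_b, ids_b = embed(smiles, seed=42, prune_rms=0.5)

# The same seed gives the same number of surviving conformers; because pruned
# conformers are not replaced, that number may be smaller than n_confs.
print(len(ids_a), len(ids_b))
```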

I have run your script and I get the same weird third conformation. This should 
certainly not happen. I will look into it.

Best,
Sereina


> On 12 Jan 2018, at 19:17, Andy Jennings <andy.j.jenni...@gmail.com> wrote:
> 
> Hi RDKitters,
> 
> Whilst looking at generating some conformations of molecules using the ETKDG 
> method with EmbedMultipleConfs I've come across some strange (to me) behavior.
> 
> When I generate conformations of some molecules with the randomSeed as -1 the 
> result is a variable number of conformations. That's not the strangest aspect 
> though - some of the conformations are quite bizarre based upon any geometry 
> rules I can think of. However, when the randomSeed is set to a fixed number 
> the odd behavior goes away and I get only reasonable conformations.
> 
> To illustrate here is some code (please no criticism of my terrible style!):
> 
> ### CODE ###
> from rdkit import Chem
> from rdkit.Chem import AllChem
> import sys
> 
> acamide = Chem.MolFromSmiles('O=C(NC=C)c1ccccc1')
> ETKDG = 1
> _seed = -1
> m = Chem.AddHs(acamide)
> n = 3
> ps = AllChem.ETKDG()
> ps.pruneRmsThresh = 0.5
> ps.numThreads = 0
> ps.randomSeed = _seed
> fixIt = 0
> for i in range(0,100):
>     ids = AllChem.EmbedMultipleConfs(m, n, ps)
>     if fixIt:
>         for _id in ids: AllChem.UFFOptimizeMolecule(m, confId = _id)
>     sys.stderr.write('%d,' % len(ids))
>     if len(ids) > 2:
>         outStream = Chem.SDWriter('test.sdf')
>         for _id in ids:
>             outStream.write(m,confId = _id)
>         outStream.flush()
>         outStream.close()
>         sys.stderr.write('\n')
>         break
> 
> ### END CODE ###
> 
> 
> This takes the smiles string for a simple acrylamide and generates a max of 3 
> conformations for the molecule. The loop runs 100 times and halts when 3 
> conformations are found - which is the sign of a bad conformation being 
> generated. When I run this the number of conformations generated each time 
> varies between 1-3 and it does so differently from run to run.
> 
> For instance:
> run #1: 
> 2,2,1,1,2,2,2,2,2,2,1,2,2,1,2,1,2,1,2,2,1,2,1,1,1,2,2,2,2,2,1,2,2,2,2,2,2,2,1,2,2,1,2,2,2,2,1,1,2,2,3,
> run #2: 2,1,2,2,2,1,1,3,
> run #3: 2,2,2,1,2,2,2,2,1,2,2,1,2,1,2,2,3,
> and so on
> 
> When I visually inspect test.sdf that results from a generation of 3 
> conformers I find that one of the conformations has a very odd amide nitrogen 
> geometry - almost linear between the heavy atoms.
> 
> If I change _seed to a number such as '1' I get a single conformation for 
> every run.
> 
> If I implement the UFF optimization (with fixIt = 1) then I'll still get 
> multiple conformations but they all look reasonable.
> 
> So, I'm not sure if there is some systematic problem here or I'm just failing 
> to understand the appropriate way to implement this form of conformational 
> search. Any insights are welcome.
> 
> Best,
> Andy


Re: [Rdkit-discuss] Conformer generation

2017-10-25 Thread Sereina
Hi Paul,

Regarding your second question:

> On 25 Oct 2017, at 18:36, Paul Hawkins <phawk...@eyesopen.com> wrote:
> 
> Also, once I generate the conformers what is best way to cluster them by RMSD 
> so that each conformer has a minimum RMSD to all the others in the set?

I think the function AllChem.GetConformerRMSMatrix() might do (parts of) what 
you want.
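
One common way to build on that matrix is Butina clustering, sketched below. This is an assumption about what may be useful here rather than a recommendation from the thread: the example molecule, the 1.0 Å threshold and the seed are arbitrary. Note that GetConformerRMSMatrix() aligns the conformers in place unless prealigned=True.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.ML.Cluster import Butina

mol = Chem.AddHs(Chem.MolFromSmiles("CCCCCO"))  # arbitrary flexible molecule
AllChem.EmbedMultipleConfs(mol, numConfs=20, randomSeed=42)
n_confs = mol.GetNumConformers()

# Lower triangle of the pairwise RMS matrix, returned as a flat list.
rms = AllChem.GetConformerRMSMatrix(mol, prealigned=False)

# Cluster on the precomputed distances; the first index in each cluster is
# its centroid. Indices follow the conformer order (0..n_confs-1 here).
clusters = Butina.ClusterData(rms, n_confs, 1.0, isDistData=True, reordering=True)
representatives = [cluster[0] for cluster in clusters]
print(len(clusters), representatives)
```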

Best,
Sereina


Re: [Rdkit-discuss] difficulties with AllChem.EmbedMultipleConfs() on a macrocycle

2017-03-03 Thread Sereina
Hi Curt,

Yes, I’m sorry, I accidentally used a modified version of mine where I played 
with a new feature to favour longer distances. With the standard RDKit version, 
I also get no conformation for the molecule without Hs. I intend to commit this 
feature as an option, but have not yet managed to do so. I will try to get to it 
in the next few days and then write a short email to the mailing list when it is in.

Best,
Sereina


On 03 Mar 2017, at 00:15, Curt Fischer <curt.r.fisc...@gmail.com> wrote:

> Thanks for the notebook Sereina!
> 
> Unfortunately when I run it I get different results.  In your version, the 
> very first call to EmbedMolecule() returns 0, which presumably means that 
> embedding went OK.
> 
> ## Embed the molecule without Hs
> AllChem.EmbedMolecule(m, useExpTorsionAnglePrefs=True, useBasicKnowledge=True)
> 
> Out[7]: 0
> 
> 
> When I run your notebook, this same call returns -1.  Maybe my rdkit is 
> different than yours?  I'm using '2016.09.2' on Mac OSX 64-bit.
> 
> 
> 
> On Thu, Mar 2, 2017 at 12:00 PM, Sereina <sereina.rini...@gmail.com> wrote:
> Hi Curt,
> 
> This is an interesting one. If you add the hydrogens before generating the 
> conformer as in your example, then no conformation can be found. However, if 
> you add them *after* the conformer generation, it works fine. Maybe that 
> could serve as a work around for you. I attach a notebook as illustration. As 
> this occurs with both DG and ETKDG, it may be due to the tests to ensure that 
> the chiral centers are correct. I will have a closer look (hopefully with 
> Greg’s help).
> 
> Best,
> Sereina
> 
> 
> 
> 
> 
> On 02 Mar 2017, at 19:34, Curt Fischer <curt.r.fisc...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> I really like combination of rdkit and py3dmol and have been able to 
>> replicate e.g. Greg's notebook here: 
>> http://nbviewer.jupyter.org/github/greglandrum/rdkit_blog/blob/master/notebooks/Trying%20py3Dmol.ipynb
>> 
>> But I can't seem to get AllChem.EmbedMultipleConfs() to generate any valid 
>> conformers for a macrotriolide, macrosphelide A.
>> 
>> macrosphelide_a_smiles = 
>> 'C[C@H]1CC(O[C@H](C)[C@H](O)/C=C/C(O[C@@H](C)[C@@H](O)/C=C/C(O1)=O)=O)=O'
>> m = Chem.MolFromSmiles(macrosphelide_a_smiles)
>> mh = Chem.AddHs(m)
>> AllChem.EmbedMultipleConfs(mh, useExpTorsionAnglePrefs=True, 
>> useBasicKnowledge=True)
>> mb = Chem.MolToMolBlock(mh)
>> 
>> The EmbedMultipleConfs() call never terminates for me.  If I use a non-zero 
>> value for maxAttempts, the call does terminate, but when I look at mb, the 
>> coordinates for all atoms are zero.
>> 
>> I've tried playing around with a few of the other options, without luck.  
>> Either all atom coordinates are still zero after EmbedMultipleConfs(), or 
>> the function call never terminates.
>> 
>> Any chance someone knows how to coax this function into yielding a useful 
>> conformation for my molecule?
>> 
>> Curt
>> 


Re: [Rdkit-discuss] MolFromPDBBlock and heterocycles

2016-09-07 Thread Sereina Riniker
Hi Steven,

The PDB reader in the RDKit doesn’t determine any bond orders - everything
is read as a single bond.
In order to set the bond orders, you need to call the
AssignBondOrdersFromTemplate() function using a reference molecule
generated from SMILES (or SDF).

Here is some example code from the docs:

>>> import os
>>> from rdkit import RDConfig
>>> from rdkit.Chem import AllChem
>>> template = AllChem.MolFromSmiles("CN1C(=NC(C1=O)(c2ccccc2)c3ccccc3)N")
>>> mol = AllChem.MolFromPDBFile(os.path.join(RDConfig.RDCodeDir, 'Chem',
...                              'test_data', '4DJU_lig.pdb'))
>>> len([1 for b in template.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8
>>> len([1 for b in mol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
22

Now assign the bond orders based on the template molecule
>>> newMol = AllChem.AssignBondOrdersFromTemplate(template, mol)
>>> len([1 for b in newMol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8

Note that the template molecule should have no explicit hydrogens
else the algorithm will fail.

Hope this helps.

Best,
Sereina


2016-09-07 17:16 GMT+02:00 Steven Combs <steven.com...@gmail.com>:

> Hello!
>
> I have a pdb block that I am working with, which is attached to this
> email. The ligand has aromatic ring structures in it; however, when it is
> read into RDKit and converted into a smiles string, the aromatic rings are
> converted into aliphatic rings. Any thoughts?
>
> Here is the python code:
>
> def extract_data(filename):
>     extracted_info = ""
>     with open(filename) as f:
>         for line in f.readlines():
>             if "HETATM" in line:
>                 extracted_info += line
>     return extracted_info
>
> for index, filename in enumerate(solution_pdb_filenames):
>     row = extract_data(filename)
>     m = Chem.MolFromPDBBlock(row, sanitize=True, removeHs=False)
>     Chem.SetHybridization(m)
>     Chem.SetAromaticity(m)
>     # not needed since sanitizing during read in, but trying to figure out if
>     # it actually worked
>     Chem.SanitizeMol(m, sanitizeOps=Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)
>     print("Parsing file " + str(index) + " of " +
>           str(len(solution_pdb_filenames)))
>     print(Chem.MolToSmiles(m, kekuleSmiles=True, allHsExplicit=True))
>
> The output smile string is:
>
> [H][O][CH]1[NH][CH]([C]([H])([H])[CH]([OH])[OH])[CH]([C]([H])([H])[C]([H])([H])[H])[CH]([CH]([OH])[CH]2[CH]([H])[CH]([H])[CH]([H])[CH]([N]([H])[H])[CH]2[H])[CH]1[N]([C]([H])([H])[H])[C]([H])([H])[H]
>
> Steven Combs
>
>
>
> 


Re: [Rdkit-discuss] MolFromPDBBlock and heterocycles

2016-09-07 Thread Sereina
Hi Steven,

The PDB reader in the RDKit doesn’t determine any bond orders - everything is 
read as a single bond. 
In order to set the bond orders, you need to call the 
AssignBondOrdersFromTemplate() function using a reference molecule generated 
from SMILES (or SDF).

Here is some example code from the docs:

>>> from rdkit.Chem import AllChem
>>> template = AllChem.MolFromSmiles("CN1C(=NC(C1=O)(c2c2)c3c3)N")
>>> mol = AllChem.MolFromPDBFile(os.path.join(RDConfig.RDCodeDir, 'Chem', 
>>> 'test_data', '4DJU_lig.pdb'))
>>> len([1 for b in template.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8
>>> len([1 for b in mol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
22

Now assign the bond orders based on the template molecule
>>> newMol = AllChem.AssignBondOrdersFromTemplate(template, mol)
>>> len([1 for b in newMol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8

Note that the template molecule should have no explicit hydrogens
else the algorithm will fail.

Hope this helps.

Best,
Sereina


On 07 Sep 2016, at 17:16, Steven Combs <steven.com...@gmail.com> wrote:

> Hello!
> 
> I have a pdb block that I am working with, which is attached to this email. 
> The ligand has aromatic ring structures in it; however, when it is read into 
> RDKit and converted into a smiles string, the aromatic rings are converted 
> into aliphatic rings. Any thoughts?
> 
> Here is the python code:
> 
> def extract_data( filename):
> extracted_info = ""
> with open(filename) as f:
> for line in f.readlines():
> if "HETATM" in line:
> extracted_info += ( line)
> return extracted_info
> 
> for index, filename in enumerate(solution_pdb_filenames):
> row = extract_data( filename)
> m = Chem.MolFromPDBBlock(row, sanitize=True, removeHs=False ) 
> Chem.SetHybridization(m)
> Chem.SetAromaticity(m)
> Chem.SanitizeMol(m, 
> sanitizeOps=Chem.rdmolops.SanitizeFlags.SANITIZE_ALL) #not needed since 
> sanitizing during read in, but trying to figure out if it actually worked
> print ("Parsing file " + str(index) + " of " + 
> str(len(solution_pdb_filenames)))
> print (Chem.MolToSmiles(m, kekuleSmiles=True, allHsExplicit=True))
> 
> The output smile string is:
> 
> [H][O][CH]1[NH][CH]([C]([H])([H])[CH]([OH])[OH])[CH]([C]([H])([H])[C]([H])([H])[H])[CH]([CH]([OH])[CH]2[CH]([H])[CH]([H])[CH]([H])[CH]([N]([H])[H])[CH]2[H])[CH]1[N]([C]([H])([H])[H])[C]([H])([H])[H]
> 
> Steven Combs
> 
> 



[Rdkit-discuss] Strange behavior with MMFFHasAllMoleculeParams()

2016-08-03 Thread Sereina
Dear all,

I stumbled upon a - to me - rather strange behavior with 
MMFFHasAllMoleculeParams().

I want to generate a molecule from SMILES, check if all MMFF parameters are 
present, add hydrogens and generate conformers. However, the outcome (error or 
not error) depends on the order of checking of the MMFF parameters and adding 
hydrogens. 

Everything is fine if I first add the hydrogens:
In [1]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3ccccc23)[nH]1')

In [1]: m = AllChem.AddHs(m)

In [2]: AllChem.MMFFHasAllMoleculeParams(m)
Out[2]: True

In [3]: AllChem.EmbedMultipleConfs(m, numConfs=100)
Out[3]: 

But here’s what happens when I first check the MMFF parameters:
In [4]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3ccccc23)[nH]1')

In [5]: AllChem.MMFFHasAllMoleculeParams(m)
Out[5]: True

In [6]: m = AllChem.AddHs(m)

In [7]: AllChem.EmbedMultipleConfs(m, numConfs=100)
RDKit ERROR: [08:41:02] Explicit valence for atom # 11 N, 4, is greater than 
permitted
---
ValueError                                Traceback (most recent call last)
 in ()
----> 1 AllChem.EmbedMultipleConfs(m, numConfs=100)

ValueError: Sanitization error: Explicit valence for atom # 11 N, 4, is greater 
than permitted

Interestingly, if I do the check first, but then remove the hydrogens before 
adding hydrogens, things work again:
In [8]: m = Chem.MolFromSmiles('Cc1nc(=O)c(C[NH3+])c(-c2c[nH]c3ccccc23)[nH]1')

In [9]: AllChem.MMFFHasAllMoleculeParams(m)
Out[9]: True

In [10]: m = AllChem.RemoveHs(m)

In [11]: m = AllChem.AddHs(m)

In [12]: AllChem.EmbedMultipleConfs(m, numConfs=100)
Out[12]: 

I cannot really explain this behavior. It only happens for some molecules. Is 
MMFFHasAllMoleculeParams() modifying the molecule, i.e. already adding 
hydrogens?

Best,
Sereina


Re: [Rdkit-discuss] Conformer generation does not sample well?

2016-06-29 Thread Sereina
torsional angle will be minimized to the closest 
minimum. In other words, ETKDG relies on distance geometry for the sampling. 
Nevertheless, I will have a look at this particular pattern in the next version 
of ETKDG to see if we can improve it.
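
To illustrate the point about sampling, here is a purely conceptual toy in plain Python. It is not the ETKDG minimization itself: the preferred torsion values and the starting angles are made-up numbers, and `closest_minimum` is a hypothetical helper. It only shows that if every starting angle from the distance-geometry stage lies near the same minimum, refinement towards the closest minimum can never populate the other one.

```python
def periodic_distance(a, b):
    """Smallest absolute difference between two angles in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def closest_minimum(angle, minima):
    """Return the preferred torsion value nearest to `angle` (periodic)."""
    return min(minima, key=lambda m: periodic_distance(angle, m))


# Hypothetical preferred torsion values for one rotatable bond
minima = [0.0, 180.0]

# Hypothetical starting torsions produced by the distance-geometry stage
starts = [10.0, 35.0, -20.0, 350.0]

# Every start is pulled to 0 degrees; 180 degrees is never reached
refined = [closest_minimum(a, minima) for a in starts]
print(refined)
```

In this toy the only way to reach the second minimum is a different starting geometry, which is the sense in which the sampling depends on the distance-geometry stage.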

Best,
Sereina



On 22 Jun 2016, at 10:41, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> This topic (https://sourceforge.net/p/rdkit/mailman/message/35173301/) 
> discussed using conformer generation as input into Open3DAlign.
> 
> One thing I noticed is that the conformer generation (using the 
> useExpTorsionAnglePrefs=True and
> useBasicKnowledge=True options) does not generate conformers that align 
> well for this example. The input is based on the 1DWD_ligand.pdb 
> structure in the RDKit distro. What I find is that the rotation of the 
> benzamidine ring is never in the right place for alignment (the other 
> two ring systems align well). This is when generating up to 10,000 
> conformers.
> 
> Does this suggest that the conformer generation does not sample 
> conformational space very effectively? Are there options for improving this?
> 
> Thanks
> 
> Tim
> 
> 
> 



Re: [Rdkit-discuss] Morgan atom invariants in the atom-pairs fingerprint?

2016-06-25 Thread Sereina
Hi Greg and Nadine,

Thanks a lot for your answers and the explanation.
I will have a look at either hashing the invariants to 9 bits or removing the 
limitation.
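
For reference, a minimal sketch in plain Python of the two options discussed in this thread: the "dictionary trick" Greg describes (assigning small integers to the invariants actually seen) and simple folding to 9 bits. The helper names `make_compressor` and `fold_invariants` are hypothetical, and the invariant values below are made up for illustration.

```python
def make_compressor(max_bits=9):
    """Return a function that maps large invariants to small integer codes.

    The lookup table is shared across calls, so the same invariant gets the
    same code in every molecule of a dataset.
    """
    limit = 1 << max_bits  # 512 distinct codes for 9 bits
    lookup = {}            # large invariant -> small code

    def compress(invariants):
        codes = []
        for inv in invariants:
            if inv not in lookup:
                if len(lookup) >= limit:
                    raise ValueError("more than %d distinct invariants" % limit)
                lookup[inv] = len(lookup)
            codes.append(lookup[inv])
        return codes

    return compress, lookup


def fold_invariants(invariants, max_bits=9):
    """Alternative: fold each invariant into max_bits bits (collisions possible)."""
    return [inv % (1 << max_bits) for inv in invariants]


compress, lookup = make_compressor()
mol_a = [2246728737, 3217380708, 2246728737]  # made-up invariant values
mol_b = [3217380708, 864662311]
codes_a = compress(mol_a)
codes_b = compress(mol_b)
folded_a = fold_invariants(mol_a)
print(codes_a, codes_b, folded_a)
```

The resulting small integers would then be passed as the atomInvariants argument in place of the raw connectivity invariants. The dictionary version avoids collisions but needs the extra book-keeping; folding needs none but can map different invariants to the same value.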

Best,
Sereina


On 24 Jun 2016, at 13:20, Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Sereina,
> 
> At the moment, for historical reasons, the atom pair and topological torsion 
> fingerprint is limited to invariants that are at most 9 bits long (i.e. 
> <512). I think it should be possible to remove the restriction, but it 
> requires a bit of rethinking of the way the code works.
> 
> What you could do is use "the dictionary trick" and assign small integers to 
> the connectivity invariants that you see in your dataset and just pass those 
> along. This is a bit of extra book-keeping code that you'd need to deal with, 
> but should work.
> 
> -greg
> 
> 
> 
> On Fri, Jun 24, 2016 at 12:10 PM, Sereina <sereina.rini...@gmail.com> wrote:
> Hi RDKitters,
> 
> Is there a way to use the Morgan (connectivity) atom invariants in the atom 
> pairs fingerprint?
> I tried naively the following:
> 
> inv = AllChem.GetConnectivityInvariants(m)
> AllChem.GetHashedAtomPairFingerprintAsBitVect(m, atomInvariants=inv)
> 
> but it gives me the error “ValueError: list element larger than allowed value"
> 
> Any ideas would be greatly appreciated.
> 
> Best,
> Sereina
> 
> 
> 



[Rdkit-discuss] Morgan atom invariants in the atom-pairs fingerprint?

2016-06-24 Thread Sereina
Hi RDKitters,

Is there a way to use the Morgan (connectivity) atom invariants in the atom 
pairs fingerprint?
I tried naively the following:

inv = AllChem.GetConnectivityInvariants(m)
AllChem.GetHashedAtomPairFingerprintAsBitVect(m, atomInvariants=inv)

but it gives me the error “ValueError: list element larger than allowed value"

Any ideas would be greatly appreciated.

Best,
Sereina


Re: [Rdkit-discuss] Getting to grips with Open3DAlign

2016-06-21 Thread Sereina
I just tried Tim’s example (or rather, a version of it that Greg sent me). 
What is missing are the hydrogens, which the torsion terms of ETKDG need to 
work properly. AllChem.AddHs() should be called before generating the 
conformations.

Best,
Sereina


On 22 Jun 2016, at 06:48, Sereina <sereina.rini...@gmail.com> wrote:

> Based on the code snippets, Paolo has not used the basic-knowledge terms 
> whereas Tim did. 
> 
> When setting useExpTorsionAnglePrefs=True and useBasicKnowledge=True, a 
> minimization is in principle not necessary anymore (unless there are 
> aliphatic rings, because we currently don’t have torsion rules for them).
> 
> Best,
> Sereina
> 
> 
> On 22 Jun 2016, at 05:02, Greg Landrum <greg.land...@gmail.com> wrote:
> 
>> I don't have anything to add to the pieces about the alignment (Paolo is the 
>> expert there!), but one comment on the conformation generation: If you used 
>> the background knowledge terms in the embedding, I don't think you should be 
>> getting the really distorted aromatic rings. Since in this case that does 
>> happen, for at least some conformations, I suspect there may be something 
>> wrong in the code.
>> 
>> I'll take a look at that (and ask Sereina too).
>> 
>> Best,
>> -greg
>> 
>> 
>> On Tue, Jun 21, 2016 at 10:30 PM, Paolo Tosco <paolo.to...@unito.it> wrote:
>> Dear Tim,
>> 
>> the Align() method returns an RMSD value, which however is computed only on 
>> a limited number of atom pairs, namely those that the algorithm was able to 
>> match between the two molecules, so a low value is not particularly 
>> informative of the overall goodness of the alignment, as it only indicates 
>> that the matched atoms were matched nicely, but there might only be a few of 
>> those in the core, while side chains are scattered all over.
>> The Score() method instead returns the O3AScore for the alignment, which is 
>> a better way to assess the quality of the superimposition.
>> 
>> The other problem in your script is that the i index is incremented before 
>> recording it in the lowest/highest variables, so the confIds are shifted by 
>> 1, as the conformation index in the RDKit is 0-based.
>> 
>> I also noticed that without minimizing the conformations the aromatic rings 
>> look quite distorted, so I added a MMFF minimization, and I increased the 
>> number of generated conformations and the pruneRmsThreshold. Setting to 
>> False the experimental torsion angle preferences and basic knowledge about 
>> rings seems to yield a larger variety of geometries which helps reproducing 
>> this quite peculiar x-ray geometry which is probably not so commonly found. 
>> Please find the modified script below.
>> 
>> Hope this helps, kind regards
>> Paolo
>> 
>> 
>> #!/usr/bin/env python
>> 
>> 
>> from rdkit import Chem, RDConfig
>> from rdkit.Chem import AllChem, rdMolAlign
>> 
>> ref = Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@@H](NC(=O)CNS(=O)(=O)c2ccc3ccccc3c2)C(=O)N2CCCCC2)cc1')
>> mol1 = Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb')
>> mol1 = AllChem.AssignBondOrdersFromTemplate(ref, mol1)
>> mol2 = Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1PPC_ligand.pdb')
>> mol2 = AllChem.AssignBondOrdersFromTemplate(ref, mol2)
>> 
>> pyO3A = rdMolAlign.GetO3A(mol1, mol2)
>> rmsd = pyO3A.Align()
>> score = pyO3A.Score()
>> print "Orig",score
>> Chem.MolToMolFile(mol1, "orig.mol")
>> 
>> cids = AllChem.EmbedMultipleConfs(mol1, numConfs=250, maxAttempts=100,
>>     pruneRmsThresh=0.5, useExpTorsionAnglePrefs=False,
>>     useBasicKnowledge=False)
>> AllChem.MMFFOptimizeMoleculeConfs(mol1, mmffVariant='MMFF94s')
>> pyO3As = rdMolAlign.GetO3AForProbeConfs(mol1, mol2, numThreads=0)
>> i = 0
>> lowest = 9.9
>> highest = 0.0
>> for pyO3A in pyO3As:
>>     rmsd = pyO3A.Align()
>>     score = pyO3A.Score()
>>     if score < lowest:
>>         lowest = score
>>         lowestConfId = i
>>     if score > highest:
>>         highest = score
>>         highestConfId = i
>>     i +=1
>> 
>> print "Lowest:", lowest, lowestConfId
>> print "Highest:", highest, highestConfId
>> 
>> Chem.MolToMolFile(mol1, "lowest.mol", confId=lowestConfId)
>> Chem.MolToMolFile(mol1, "highest.mol", confId=highestConfId)
>> 
>> 
>> On 06/21/16 15:41, Tim Dudgeon wrote:
>>> Hi All,
>

Re: [Rdkit-discuss] Getting to grips with Open3DAlign

2016-06-21 Thread Sereina
Based on the code snippets, Paolo has not used the basic-knowledge terms 
whereas Tim did. 

When setting useExpTorsionAnglePrefs=True and useBasicKnowledge=True, a 
minimization is in principle not necessary anymore (unless there are aliphatic 
rings, because we currently don’t have torsion rules for them).

Best,
Sereina


On 22 Jun 2016, at 05:02, Greg Landrum <greg.land...@gmail.com> wrote:

> I don't have anything to add to the pieces about the alignment (Paolo is the 
> expert there!), but one comment on the conformation generation: If you used 
> the background knowledge terms in the embedding, I don't think you should be 
> getting the really distorted aromatic rings. Since in this case that does 
> happen, for at least some conformations, I suspect there may be something 
> wrong in the code.
> 
> I'll take a look at that (and ask Sereina too).
> 
> Best,
> -greg
> 
> 
> On Tue, Jun 21, 2016 at 10:30 PM, Paolo Tosco <paolo.to...@unito.it> wrote:
> Dear Tim,
> 
> the Align() method returns an RMSD value, which however is computed only on a 
> limited number of atom pairs, namely those that the algorithm was able to 
> match between the two molecules, so a low value is not particularly 
> informative of the overall goodness of the alignment, as it only indicates 
> that the matched atoms were matched nicely, but there might only be a few of 
> those in the core, while side chains are scattered all over.
> The Score() method instead returns the O3AScore for the alignment, which is a 
> better way to assess the quality of the superimposition.
> 
> The other problem in your script is that the i index is incremented before 
> recording it in the lowest/highest variables, so the confIds are shifted by 
> 1, as the conformation index in the RDKit is 0-based.
> 
> I also noticed that without minimizing the conformations the aromatic rings 
> look quite distorted, so I added a MMFF minimization, and I increased the 
> number of generated conformations and the pruneRmsThreshold. Setting to False 
> the experimental torsion angle preferences and basic knowledge about rings 
> seems to yield a larger variety of geometries which helps reproducing this 
> quite peculiar x-ray geometry which is probably not so commonly found. Please 
> find the modified script below.
> 
> Hope this helps, kind regards
> Paolo
> 
> 
> #!/usr/bin/env python
> 
> 
> from rdkit import Chem, RDConfig
> from rdkit.Chem import AllChem, rdMolAlign
> 
> ref = Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@@H](NC(=O)CNS(=O)(=O)c2ccc3ccccc3c2)C(=O)N2CCCCC2)cc1')
> mol1 = Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb')
> mol1 = AllChem.AssignBondOrdersFromTemplate(ref, mol1)
> mol2 = Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1PPC_ligand.pdb')
> mol2 = AllChem.AssignBondOrdersFromTemplate(ref, mol2)
> 
> pyO3A = rdMolAlign.GetO3A(mol1, mol2)
> rmsd = pyO3A.Align()
> score = pyO3A.Score()
> print "Orig",score
> Chem.MolToMolFile(mol1, "orig.mol")
> 
> cids = AllChem.EmbedMultipleConfs(mol1, numConfs=250, maxAttempts=100,
>     pruneRmsThresh=0.5, useExpTorsionAnglePrefs=False,
>     useBasicKnowledge=False)
> AllChem.MMFFOptimizeMoleculeConfs(mol1, mmffVariant='MMFF94s')
> pyO3As = rdMolAlign.GetO3AForProbeConfs(mol1, mol2, numThreads=0)
> i = 0
> lowest = 9.9
> highest = 0.0
> for pyO3A in pyO3As:
>     rmsd = pyO3A.Align()
>     score = pyO3A.Score()
>     if score < lowest:
>         lowest = score
>         lowestConfId = i
>     if score > highest:
>         highest = score
>         highestConfId = i
>     i +=1
> 
> print "Lowest:", lowest, lowestConfId
> print "Highest:", highest, highestConfId
> 
> Chem.MolToMolFile(mol1, "lowest.mol", confId=lowestConfId)
> Chem.MolToMolFile(mol1, "highest.mol", confId=highestConfId)
> 
> 
> On 06/21/16 15:41, Tim Dudgeon wrote:
>> Hi All,
>> 
>> I'm trying to get to grips with using Open3D Align in RDKit, but hitting 
>> problems.
>> 
>> My approach is to generate random conformers of the probe molecule and align 
>> it to the reference molecule.  My example is cobbled together from the 
>> examples in the cookbook.
>> 
>> 
>> 
>> from rdkit import Chem, RDConfig
>> from rdkit.Chem import AllChem, rdMolAlign
>> 
>> ref = Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@@H](NC(=O)CNS(=O)(=O)c2ccc3ccccc3c2)C(=O)N2CCCCC2)cc1')
>> mol1 = Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb')
>> mol1 = AllChem.AssignBondOrdersFromTemplate(ref,

[Rdkit-discuss] New conformer generator: ETKDG

2015-11-26 Thread Sereina
Dear RDKitters,

Since release 2015.09.1, a new conformer generation method, termed ETKDG, has 
been available in the RDKit. 
The paper describing the method and its performance was published last week: 
http://pubs.acs.org/doi/abs/10.1021/acs.jcim.5b00654

To use the method, two flags have to be set when calling the embedding function:
AllChem.EmbedMolecule(mol, useExpTorsionAnglePrefs=True, useBasicKnowledge=True)
or
AllChem.EmbedMultipleConfs(mol, useExpTorsionAnglePrefs=True, 
useBasicKnowledge=True)

The method uses experimental torsional-angle preferences for a set of SMARTS 
patterns. If you would like to know which patterns matched your molecule, use 
the flag printExpTorsionAngles=True.

The experimental torsional-angle preferences (flag useExpTorsionAnglePrefs) and 
the “basic knowledge”-terms (flag useBasicKnowledge, for a description, see the 
paper) can also be used separately — this corresponds to the ETDG and KDG 
methods — but we found the combination (ETKDG) to perform best.

The new method is slower than standard distance geometry, but the nice thing 
is that the generated conformers can now be used directly (e.g. the aromatic 
rings are flat), i.e. no force field minimization is required. Overall, the 
new method is therefore faster.

I hope you find the new conformer generator useful. If you encounter problems, 
please let Greg and me know.

Best,
Sereina




Re: [Rdkit-discuss] SimilarityMaps

2015-02-20 Thread Sereina
Hi Matthew,

I think this is related to a previous mailing list item 
(https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03528.html).

It has probably something to do with the bounding boxes (they get scaled during 
the map generation process). In the previous case it was enough to set 
bbox_inches='tight' when saving the image to solve the problem.

I hope this helps.

Best,
Sereina


On 21 Feb 2015, at 02:28, Matthew Lardy mla...@gmail.com wrote:

 Hi,
 
 I am having an issue with the python similaritymaps.  I am only seeing a 
 fraction of the molecule.  Anyone else have this issue?
 
 Thanks in advance!
 Matthew



[Rdkit-discuss] Torsion fingerprint deviation

2014-11-04 Thread Sereina
Dear all,

I’m happy to announce that the torsion fingerprint deviation (TFD) developed in 
Rarey’s group (J. Chem. Inf. Model., 52, 1499, 2012) is now available in the 
Python part of the RDKit. Thanks a lot to Gregori and Ilenia (+ Christin 
Schäfer) for their help at the UGM Hackathon in resolving the last 
disagreements with the paper!

Here are some small usage examples; more detailed documentation can be found 
in the Cookbook. There are three wrapper functions for convenience:

TFD between two sets of conformers of a molecule:
 from rdkit.Chem import TorsionFingerprints
 tfd = TorsionFingerprints.GetTFDBetweenConformers(mol, confIds1=[0, 2], 
 confIds2=[1, 3])
The result is a list of - in this case 4 - TFD values. 

TFD between two instances of the same molecule with different conformers. If no 
confIds are specified, the first one of each molecule is taken.
 tfd = TorsionFingerprints.GetTFDBetweenMolecules(mol1, mol2)
The result is a list containing in this case a single TFD value.

For clustering or diversity picking purposes, there is also a convenience 
function to get the matrix of TFD values.
 tfdmat = TorsionFingerprints.GetTFDMatrix(mol)

The different steps of the TFD calculation can also be accessed independently:
1) A list of the torsions (one for non-ring bonds, one for ring bonds) is 
generated. For each torsion, the indices of the four atoms are stored. This has 
to be done only once for a molecule.
2) The weights for the torsions are calculated. By default, the bond in the 
centre of the molecule receives the highest weight, and the other weights 
decrease with the distance from the central bond. If another part of the 
molecule should have the highest weight, the user can also specify two atom 
indices that represent the most important bond. Again, this step has to be done 
only once for a molecule.
3) The torsion angles are calculated for each conformer of interest given the 
torsion lists.
4) The TFD value between two conformers is calculated given the torsion angles 
and weights.
 tors_list, ring_tors_list = TorsionFingerprints.CalculateTorsionLists(mol)
 weights = TorsionFingerprints.CalculateTorsionWeights(mol)
 torsions1 = TorsionFingerprints.CalculateTorsionAngles(mol, tors_list, 
 ring_tors_list, confId=0)
 torsions2 = TorsionFingerprints.CalculateTorsionAngles(mol, tors_list, 
 ring_tors_list, confId=1)
 tfd = TorsionFingerprints.CalculateTFD(torsions1, torsions2, 
 weights=weights)
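
To make step 4 more concrete, here is a minimal sketch in plain Python of a weighted, normalized torsion deviation. It is a simplification under stated assumptions and not the RDKit implementation: torsions are plain lists of angles in degrees, every torsion is normalized by the same maximum deviation of 180 degrees, and the names `torsion_deviation` and `weighted_tfd` are hypothetical. The actual functions may use different data structures and treat ring and symmetric torsions specially.

```python
def torsion_deviation(a, b):
    """Smallest absolute difference between two torsion angles in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def weighted_tfd(torsions1, torsions2, weights=None, max_dev=180.0):
    """Weighted mean of the normalized torsion deviations of two conformers."""
    if len(torsions1) != len(torsions2):
        raise ValueError("both conformers need the same number of torsions")
    if weights is None:
        weights = [1.0] * len(torsions1)
    total = sum(w * torsion_deviation(a, b) / max_dev
                for a, b, w in zip(torsions1, torsions2, weights))
    return total / sum(weights)


# Made-up torsion angles (degrees) for two conformers and made-up weights
conf1 = [60.0, 180.0, -170.0]
conf2 = [60.0, 0.0, 170.0]
w = [1.0, 0.5, 0.25]

tfd_value = weighted_tfd(conf1, conf2, weights=w)
print(tfd_value)
```

Note the periodicity handling: -170 and 170 degrees differ by 20 degrees, not 340. Identical conformers give 0, and the weights let central torsions dominate the value.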

Let me know if you encounter any problems.

Best,
Sereina


Re: [Rdkit-discuss] Chem.AddHs() doesn't care about compound layout

2014-08-20 Thread Sereina
Dear Michal,

Chem.AddHs() has the option “addCoords” which is normally set to False.
So, using

mol = Chem.AddHs(mol, addCoords=True)

should solve your problem.

Best,
Sereina



On 20 Aug 2014, at 19:07, Michał Nowotka mmm...@gmail.com wrote:

 Hello,
 
 Imagine I have a compound with some 2D coordinates I really like:
 
mol
 
 Now I would like to add hydrogens to it:
 
 mol = Chem.AddHs(mol)
 
 The problem is, all new hydrogen atoms will have (0,0,0) coordinates,
 which doesn't look to good...
 
 I could force recomputing 2D coords for the whole compound:
 
AllChem.Compute2dCoords(mol)
 
 But this will ruin my beautiful layout of the original, non-hydrogen part...
 
 Is it possible to layout hydrogens around my compound after I add them?
 
 Regards,
 Michał Nowotka
 




Re: [Rdkit-discuss] similarity maps look strange when displayed

2014-03-21 Thread sereina riniker
Hi Michal

I think this is related to a previous mailing list item (
https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03528.html
).

It has probably something to do with the bounding boxes (they get scaled
during the map generation process). In the previous case it was enough to
set bbox_inches='tight' when saving the image to solve the problem. Maybe
there is something similar for mpld3.

Best,
Sereina





2014-03-21 12:12 GMT+01:00 Michał Nowotka mmm...@gmail.com:

 Look at the following example:

 import gi
 from rdkit import Chem
 from rdkit.Chem import Draw
 from rdkit.Chem.Draw import SimilarityMaps
 import matplotlib.pyplot as plt
 mol =
 Chem.MolFromSmiles('COc12cc(C(=O)NN3CCN(c45nccnc54)CC3)oc21')
 refmol =
 Chem.MolFromSmiles('CCCN(N1CCN(c2c2OC)CC1)Cc1ccc2c2c1')
 fp = SimilarityMaps.GetAPFingerprint(mol, fpType='normal')
 fig, maxweight = SimilarityMaps.GetSimilarityMapForFingerprint(refmol,
 mol, SimilarityMaps.GetMorganFingerprint)
 plt.show()

 This displays the similarity map. Unfortunately the image is not scaled to fit
 the available area and it's not centered. This causes problems with the mpld3
 library, which converts matplotlib to javascript:

 from rdkit import Chem
 from rdkit.Chem import Draw
 from rdkit.Chem.Draw import SimilarityMaps
 import mpld3
 mol =
 Chem.MolFromSmiles('COc12cc(C(=O)NN3CCN(c45nccnc54)CC3)oc21')
 refmol =
 Chem.MolFromSmiles('CCCN(N1CCN(c2c2OC)CC1)Cc1ccc2c2c1')
 fp = SimilarityMaps.GetAPFingerprint(mol, fpType='normal')
 fig, maxweight = SimilarityMaps.GetSimilarityMapForFingerprint(refmol,
 mol, SimilarityMaps.GetMorganFingerprint)
 mpld3.show_d3(fig)

 Again, the image is much larger than the drawing area and is not aligned.

 I've tried several options: changing the coordScale or scale parameter, but
 without success. Any help in displaying the image correctly using
 plt.show() and/or mpld3.show_d3 would be appreciated.

 Regards,
 Michal Nowotka






Re: [Rdkit-discuss] How to get coordinates for each atom in molecule?

2014-01-24 Thread sereina riniker
Hi Michael,

You can get the atom positions via the conformer:

m = Chem.MolFromSmiles('c1ccccc1')
AllChem.Compute2DCoords(m)
pos = m.GetConformer().GetAtomPosition(0) # position of atom 0

This gives you an rdGeometry.Point3D; for example, you get the x coordinate with:

x = pos.x

I hope this is what you were looking for.

Best,
Sereina




2014/1/24 Michał Nowotka mmm...@gmail.com

 Hi,

 Let's say I loaded a molfile containing coordinates to RDKit mol
 object or loaded it from smiles but called
 AllChem.Compute2DCoords(mol).
 Now I would like to get coordinates for each atom. Unfortunately Atom
 class doesn't have any GetCoords method but this is understandable
 since position is optional. I tried to look into properties but it
 seems that they are stored in some stage container exported from C++:

 for atom in mol.GetAtoms():
 print atom.GetPropNames()
:
 rdkit.rdBase._vectSs object at 0xa455aec
 rdkit.rdBase._vectSs object at 0xa455aec
 ...


 Some blind guesses such as: atom.GetProp('x'), atom.GetProp('X')
 failed. Mol object itself doesn't provide any method that would
 suggest that it can return coordinates

 So is there any way to get this data without parsing original molfile?


 Regards,

 Michal Nowotka





Re: [Rdkit-discuss] PDB reader and bond perception

2014-01-14 Thread sereina riniker
Hi JP,

However I am unable to get bond orders for the protein side - am I doing
 something wrong or is this the intended behaviour ?
 I imagine I can use AssignBondOrdersFromTemplate() for the 20 amino acids
 and set these myself -- or is there a better way to do this?


I don't know why your protein doesn't get bond orders; the PDB parser should
know the standard amino acids. At least it worked for me when I tried
Chem.MolFromPDBFile() in the past. Which PDB structure are you trying to read?


  Also, is there a way to make AssignBondOrdersFromTemplate assign bond
 orders to all matches?


The function was meant for assigning bonds based on an entire molecule. It
would probably not be so difficult to change this (with default = match
only one), if it is really needed.


 Also another thing I don't quite understand is in the following below
 code, I get a WARNING: More than one matching pattern found - picking one
 but how can my template match multiple times (this is not symettrical) ?


The way the AssignBondOrdersFromTemplate() function works is the following:
1) a copy of the template is generated where all bonds are set to single
bonds
2) this single-bonds copy is used for a substructure match with the query
molecule
3) bond orders are assigned based on this match and the original template

If you get this warning, it means that there is some symmetry in the
all-single-bonds-stage of your molecule. In your case, I guess it's the
carboxylic acids which can match two ways when there are only single bonds.
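
The three steps and the resulting ambiguity can be sketched with a toy graph matcher in plain Python. This is an illustration only and not the RDKit code: molecules are represented as lists of element symbols plus a dict of bonds, `find_matches` is a hypothetical brute-force helper, and the example is acetic acid with heavy atoms only.

```python
from itertools import permutations


def bond_key(i, j):
    """Order-independent key for the bond between atoms i and j."""
    return (i, j) if i < j else (j, i)


def find_matches(t_atoms, t_bonds, m_atoms, m_bonds):
    """Return every mapping template atom -> molecule atom that preserves
    element symbols, connectivity and bond orders (brute force)."""
    matches = []
    for perm in permutations(range(len(m_atoms)), len(t_atoms)):
        if any(t_atoms[k] != m_atoms[perm[k]] for k in range(len(t_atoms))):
            continue
        if all(m_bonds.get(bond_key(perm[i], perm[j])) == order
               for (i, j), order in t_bonds.items()):
            matches.append(perm)
    return matches


# Toy acetic acid, heavy atoms only: CH3-C(=O)-OH
atoms = ["C", "C", "O", "O"]
template_bonds = {(0, 1): 1, (1, 2): 2, (1, 3): 1}  # template with bond orders
pdb_bonds = {(0, 1): 1, (1, 2): 1, (1, 3): 1}        # from PDB: all single

# Step 1: copy of the template with every bond set to single
single_template = {bond: 1 for bond in template_bonds}

# Step 2: match the all-single template against the PDB molecule
matches = find_matches(atoms, single_template, atoms, pdb_bonds)

# Step 3: transfer the original bond orders using the first match
first = matches[0]
assigned = {bond_key(first[i], first[j]): order
            for (i, j), order in template_bonds.items()}

# With real bond orders the two oxygens are distinguishable: one match only
unique = find_matches(atoms, template_bonds, atoms, template_bonds)
print(len(matches), len(unique), assigned)
```

In the all-single-bonds stage the two oxygens are interchangeable, so the toy matcher finds two mappings even though the template with its real bond orders is not symmetric. That is the situation in which the warning about more than one matching pattern appears.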

I hope this helps.

Best,
Sereina






 On 13 January 2014 21:02, JP jeanpaul.ebe...@inhibox.com wrote:
 
  Thanks All - I think I am in a good place now.
 
  I can get the SMILES from Paul's mmcif links and then I can use Sereina
 magic three lines to do what I want.  I'd cross my fingers - but with RDKit
 you don't need to.
  This works for all Chemical Components (or what other fashionable name
 they go by these days) in the PDB.
 
  For posterity: I have found a post in the mailing list started by James
 which sheds some light on this:
 
 https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03481.html
 
 
 
 

Re: [Rdkit-discuss] PDB reader and bond perception

2014-01-13 Thread sereina riniker
Hi JP,

If you have also a SMILES of the molecule you want to read from PDB, you
can assign the bond orders based on this template:

from rdkit import Chem
from rdkit.Chem import AllChem

tmp = Chem.MolFromPDBFile(yourfilename)
template = Chem.MolFromSmiles(yoursmiles)
mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)
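
To see the effect without a PDB file at hand, here is a small self-contained sketch. It assumes that an all-single-bond molecule built from SMILES is a fair stand-in for what the PDB reader returns for a ligand; the phenol example is chosen only for illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in for Chem.MolFromPDBFile(...): a ligand whose bonds are all single
# (assumption: this mimics the PDB reader's output for a ligand).
tmp = Chem.MolFromSmiles('OC1CCCCC1')
# Template carrying the correct bond orders (phenol).
template = Chem.MolFromSmiles('Oc1ccccc1')

mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)
assigned_smiles = Chem.MolToSmiles(mol)
print(assigned_smiles)
```

Because the ring is symmetric, RDKit may log the "more than one matching pattern" warning here; the result is unaffected.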

Is this what you're looking for?

Best,
Sereina


2014/1/13 JP jeanpaul.ebe...@inhibox.com

 RDKitters!

 Finally back on the mailing list!

 I am sure we've been through this at the UGM (my mind must have wandered
 off!), but a quick question about the PDB reader and bond perception.  Is
 this supported with the current PDB reader?  I remember that someone
 (PaulE, perhaps?) was saying bond perception was painful, but there was
 some dictionary for PDB ligands which helps (any idea the name of this
 dictionary?).

 To the technical details.

 I am reading in the following PDB file with a simple MolFromPDBFile() call:

 HETATM    1  O1P 84T A1862     -27.016   9.387 -72.564  1.00 20.81           O
 HETATM    2  P   84T A1862     -27.282   9.818 -73.968  1.00 19.65           P
 HETATM    3  O2P 84T A1862     -27.881  11.176 -74.182  1.00 21.49           O
 HETATM    4  N   84T A1862     -25.869   9.583 -74.813  1.00 19.78           N
 HETATM    5  C   84T A1862     -25.759  10.010 -76.075  1.00 19.97           C
 HETATM    6  CA  84T A1862     -24.493   9.748 -76.807  1.00 19.75           C
 HETATM    7  CB  84T A1862     -24.794   8.678 -77.847  1.00 19.73           C
 HETATM    8  CG  84T A1862     -23.571   8.324 -78.681  1.00 19.70           C
 HETATM    9  CD2 84T A1862     -23.309   9.519 -79.611  1.00 18.49           C
 HETATM   10  CD1 84T A1862     -23.863   6.932 -79.305  1.00 18.60           C
 HETATM   11  OHB 84T A1862     -25.210   7.467 -77.223  1.00 19.17           O
 HETATM   12  OH  84T A1862     -23.549   9.127 -75.984  1.00 20.33           O
 HETATM   13  O   84T A1862     -26.672  10.517 -76.692  1.00 20.26           O
 HETATM   14  O5' 84T A1862     -28.377   8.861 -74.619  1.00 19.39           O
 HETATM   15  C5' 84T A1862     -28.002   7.536 -74.954  1.00 18.47           C
 HETATM   16  C4' 84T A1862     -28.909   7.000 -76.012  1.00 18.24           C
 HETATM   17  C3' 84T A1862     -28.901   7.826 -77.298  1.00 18.28           C
 HETATM   18  C2' 84T A1862     -30.318   7.610 -77.768  1.00 18.69           C
 HETATM   19  O2' 84T A1862     -30.789   8.641 -78.581  1.00 19.64           O
 HETATM   20  O4' 84T A1862     -30.262   6.951 -75.529  1.00 18.80           O
 HETATM   21  C1' 84T A1862     -31.152   7.470 -76.521  1.00 19.01           C
 HETATM   22  N9  84T A1862     -31.753   8.732 -76.009  1.00 20.08           N
 HETATM   23  C4  84T A1862     -33.033   9.013 -76.158  1.00 21.10           C
 HETATM   24  N3  84T A1862     -34.018   8.339 -76.786  1.00 21.58           N
 HETATM   25  C2  84T A1862     -35.263   8.846 -76.830  1.00 21.95           C
 HETATM   26  C8  84T A1862     -31.223   9.701 -75.291  1.00 20.27           C
 HETATM   27  N7  84T A1862     -32.173  10.618 -75.019  1.00 21.28           N
 HETATM   28  C5  84T A1862     -33.315  10.213 -75.563  1.00 21.81           C
 HETATM   29  C6  84T A1862     -34.624  10.702 -75.627  1.00 22.85           C
 HETATM   30  N1  84T A1862     -35.550  10.010 -76.285  1.00 22.44           N
 HETATM   31  N6  84T A1862     -35.008  11.862 -75.052  1.00 23.86           N
 TER
 END

 But I am losing all the double bond (and aromatic) information:

 import sys
 from rdkit import Chem
 m = Chem.MolFromPDBFile(sys.argv[1])
 print(Chem.MolToSmiles(m))

 Gives me:

 CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1

 As usual, many thanks for your time,

 -
 Jean-Paul Ebejer
 Early Stage Researcher




Re: [Rdkit-discuss] MolFromXYZ?

2013-11-04 Thread sereina riniker
Hi Michal,

Well, if you have your 3D coordinates as a PDB file, you can read them in
with the new PDB parser and assign the bond orders based on a template
(generated from the SMILES of your molecule):

from rdkit import Chem
from rdkit.Chem import AllChem

tmp = Chem.MolFromPDBFile(yourfilename)
template = Chem.MolFromSmiles(yoursmiles)
mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)
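
For the conversion to SDF, the whole route might look like the sketch below. It is only an illustration under stated assumptions: the PDB block is generated from an RDKit-embedded acetamide as a stand-in for coordinates coming from another program, and flavor=8 is used so that the block does not encode bond orders.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

smiles = 'CC(=O)N'  # acetamide, chosen only for illustration

# Build a 3D geometry and write its heavy atoms as a PDB block; this stands in
# for coordinates produced elsewhere. flavor=8: no bond orders via CONECT.
ref = Chem.AddHs(Chem.MolFromSmiles(smiles))
AllChem.EmbedMolecule(ref, randomSeed=42)
pdb_block = Chem.MolToPDBBlock(Chem.RemoveHs(ref), flavor=8)

# Read the coordinates back, then restore bond orders from the SMILES template.
tmp = Chem.MolFromPDBBlock(pdb_block)
template = Chem.MolFromSmiles(smiles)
mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)

# Mol block with the original coordinates and the assigned bond orders.
sdf_block = Chem.MolToMolBlock(mol)
print(sdf_block)
```

To write an actual SD file, Chem.SDWriter can be used instead of MolToMolBlock.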

I don't know if this is what you were looking for.

Best,
Sereina



2013/11/4 Michal Krompiec michal.kromp...@gmail.com

 Hello,
 Is it possible to construct a Mol (or EditableMol) object out of a
 list of 3D coordinates? I am trying to write a bridge between cclib
 and RDKit, and I need a function to convert 3D geometries to SDF.
 Thanks,
 Michal




Re: [Rdkit-discuss] Similarity map images save just bottom corner

2013-11-04 Thread sereina riniker
Hi Anthony,

It has something to do with the bounding boxes (they get scaled during the
map generation process).
Using bbox_inches='tight', however, worked for me, i.e.
image.savefig('out.png', bbox_inches='tight')
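
The effect can be reproduced with Matplotlib alone. This sketch assumes that the cropping comes from axes that extend beyond the figure canvas, which is one reading of "the bounding boxes get scaled"; the oversized axes and the helper saved_size are hypothetical and not taken from the similarity map code.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# A 200 x 200 pixel canvas with axes twice as large as the canvas.
fig = plt.figure(figsize=(2, 2), dpi=100)
ax = fig.add_axes([0, 0, 2, 2])
ax.plot([0, 1], [0, 1])


def saved_size(figure, **kwargs):
    """Hypothetical helper: save to memory and return (height, width) in pixels."""
    buf = io.BytesIO()
    figure.savefig(buf, format="png", **kwargs)
    buf.seek(0)
    return plt.imread(buf, format="png").shape[:2]


default_size = saved_size(fig)                     # canvas only: drawing is cropped
tight_size = saved_size(fig, bbox_inches="tight")  # grows to include all artists
print(default_size, tight_size)
```

With the default settings only the canvas area is written, so everything outside it is cut off; bbox_inches='tight' makes savefig compute the box from the drawn artists instead.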

I hope this helps.

Best,
Sereina



2013/11/4 Anthony Bradley anthony.brad...@worc.ox.ac.uk

  Hi all,

 I’m having some difficulty trying to save the new similarity map images.
 (They’re super cool by the way!)

 If I do the following in a python shell:

 from rdkit import Chem
 from rdkit.Chem.Draw import SimilarityMaps
 image = SimilarityMaps.GetSimilarityMapFromWeights(Chem.MolFromSmiles('CCC'), [1, 2, 3])
 # Just a dummy image
 image.savefig('out.png')
 # The outputted image is just the bottom left hand corner

 The image saved is cropped to the left hand corner (saved.png). It will
 render perfectly in an IPython notebook however. (ipython.png)

 Am I missing something here about saving Matplotlib images?

 Cheers,

 Anthony



Re: [Rdkit-discuss] Beta of Q3 2013 release available

2013-10-25 Thread sereina riniker
Hi James,

Regarding the AssignBondOrdersFromTemplate() method:
As far as I understood, the PDB reader assigns bond orders to the amino
acids in a protein, but if a ligand is present it puts all bonds of it to
SINGLE bonds as auto bond-type perception is not trivial (see Roger's
comments). However, usually one knows which ligand was crystallized (i.e.
the SMILES is available), so the AssignBondOrdersFromTemplate() method can
be used to set the bond orders based on the known ligand structure. This is
the idea of the method. Now, to your real-world application. I'm sorry but
I don't think I understand it completely. Do you want to set only the bond
orders of a specific substructure? Or would you like to give the function a
set of ligands and a set of templates and it figures out which template
belongs to which ligand and sets the bonds orders accordingly?

Best,
Sereina



2013/10/24 Greg Landrum greg.land...@gmail.com

 James,

 On Thu, Oct 24, 2013 at 7:27 PM, James Davidson j.david...@vernalis.com
 wrote:

  Hi Greg (et al.),

 Thanks for the beta!  I have been going through some of the
 recently-added functionality, and had a couple of questions regarding the
 PDB reading / writing.


 Thanks for the bug reports!

 1. Do I remember correctly that there was a proposal (from
 Roger) to add some auto bond-type perception to the PDB parser for ligands
 (or is that just wishful thinking!)?

 Roger will have to confirm this, but I believe he said something along the
 lines of "that way lies madness".

 2. If not, I notice that there is an
 AssignBondOrdersFromTemplate() method – but the example in the doc-string
 only shows (I think) the case where the input PDB is just a single small
 molecule – so the matching is pretty easy!  I think a more real-world case
 is when one wants to set the bond orders for multiple ligands (HETATM
 residues) based on substructure matches – which will then return an atom
 index selection that can be used as a start point.  Is there any way to
 have the AssignBondOrdersFromTemplate() convenience function optionally
 accept a list of atom indexes to specify a substructure?

 Sereina? Is that doable?

 

 3. Is there some explanation for what the ‘flavor’ option does
 for reading/writing PDB?

 I'm not sure about the reader. Roger, can you answer that?

 This is what's in the C++ for the PDBWriter:
 // PDBWriter support multiple flavors of PDB output
 // flavor  1 : Write MODEL/ENDMDL lines around each record
 // flavor  2 : Don't write any CONECT records
 // flavor  4 : Write CONECT records in both directions
 // flavor  8 : Don't use multiple CONECTs to encode bond order
 // flavor  16 : Write MASTER record
 // flavor  32 : Write TER record

 This is now in the docs for both the Python and C++ code.

 

 4. Having read in a PDB file I see the correct atoms flagged
 as HETATM (from GetIsHeteroAtom()).  But when I call Chem.MolToPDBBlock()
 these atoms get written as ATOM records…  Also, a Chem.MolToPDBFile()
 method would be nice for completeness / symmetry : )

 The HETATM thing was the result of a dumb copy and paste error from me.
 It's fixed.

 Re: Chem.MolToPDBFile()
 that's missing because there's no corresponding Chem.MolToMolFile()
 This is an odd oversight, which I've now fixed.

 

 5. It seems to me that GetResidueNumber() and
 GetSerialNumber() may have got mixed-up at some point(?).  At least, when I
 call GetSerialNumber() I see what appears to be the residue number; and
 when I call GetResidueNumber() I get “0”!

 This was another dumb bug from me. It's fixed.
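
For reference, the per-atom PDB information discussed in this point can be inspected as sketched below. The round trip through MolToPDBBlock is only a way to obtain a PDB-derived molecule without a file; the residue and atom names it produces are whatever the writer assigns by default.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles('CCO')
AllChem.Compute2DCoords(mol)

# Write and re-read, so that every atom carries PDB residue information.
pdb_mol = Chem.MolFromPDBBlock(Chem.MolToPDBBlock(mol))

records = []
for atom in pdb_mol.GetAtoms():
    info = atom.GetPDBResidueInfo()
    records.append((info.GetSerialNumber(),    # atom serial number
                    info.GetName().strip(),    # atom name
                    info.GetResidueName(),     # residue name
                    info.GetResidueNumber(),   # residue sequence number
                    info.GetIsHeteroAtom()))   # True for HETATM records

for record in records:
    print(record)
```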

 

 6. I also seem to be seeing all of the bonds (for all
 residues) being written out in CONECT records – such that they all appear
 as single bonds in e.g. PyMOL – is this expected behaviour at the moment?

 Another one for Roger.

 -greg




Re: [Rdkit-discuss] Beta of Q3 2013 release available

2013-10-25 Thread sereina riniker
Hi James,

Okay, now it's clear. I somehow (wrongly) thought the PDB reader would give
you the protein and the ligand as two molecules and then it wouldn't have
been a problem... I will discuss with Greg on how to best do this and get
back to you.

Best,
Sereina


2013/10/25 James Davidson j.david...@vernalis.com

 Hi Sereina,

 Sereina wrote:
  Regarding the AssignBondOrdersFromTemplate() method:
  As far as I understood, the PDB reader assigns bond orders to the amino
 acids in a protein, but if a ligand is present it puts all bonds of it to
 SINGLE bonds as auto bond-type perception is not trivial (see Roger's
 comments).
  However, usually one knows which ligand was crystallized (i.e. the
 SMILES is available), so the AssignBondOrdersFromTemplate() method can be
 used to set the bond orders based on the known ligand structure.
  This is the idea of the method. Now, to your real-world application. I'm
 sorry but I don't think I understand it completely. Do you want to set only
 the bond orders of a specific substructure?
  Or would you like to give the function a set of ligands and a set of
 templates and it figures out which template belongs to which ligand and
 sets the bonds orders accordingly?

 This is very likely to be me being stupid - so please bear with me!
 If I read in a complex (pdb), and already have my reference ligand (lig),
 then AllChem.AssignBondOrdersFromTemplate(lig, pdb) fails because the
 reference ligand has not been matched to the ligand in the pdb 'complex'
 (dot-separated list of molecules).
 The doc-string states that the method works on two molecules - but I want
 to work on a reference molecule (lig) and a *substructure* of the
 macromolecule (pdb).  How should I be getting the bound ligand out as a
 molecule object to then use the AssignBondOrdersFromTemplate() method?  Am
 I missing some new PDB-related methods, or have I forgotten some
 fundamental RDKit methods for dealing with multi-component molecules?

 I guess a sensible process would be:
 1. Identify any HETATM residues
 2. For each residue (or at least those that have bonds!) extract or copy
 the mol (unless it can be addressed 'in place'?)
 3. Use AssignBondOrdersFromTemplate() - relying on lookup by e.g. residue
 name, etc
 4. Insert the molecule back into the complex (or update the info if it has
 been modified 'in place')

 Is this how the method is intended to be used with complexes (and if so,
 do you have an example for steps 2 and 4)?
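
Steps 1 to 3 of the process above could be sketched roughly as follows. This is a hedged illustration, not an answer from the thread: the helper assign_ligand_bond_orders is hypothetical, fragments are picked by connectivity and atom count rather than by HETATM residue, and a dot-separated SMILES stands in for a complex read from PDB. Step 4 (re-inserting the ligand) is not covered.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def assign_ligand_bond_orders(template, complex_mol):
    """Hypothetical helper: try the template on each disconnected fragment."""
    fragments = []
    for frag in Chem.GetMolFrags(complex_mol, asMols=True):
        # Only fragments of the same size as the template are candidates.
        if frag.GetNumAtoms() == template.GetNumAtoms():
            try:
                frag = AllChem.AssignBondOrdersFromTemplate(template, frag)
            except ValueError:
                pass  # template does not fit this fragment; keep it unchanged
        fragments.append(frag)
    return fragments


# Stand-in for a PDB complex: two components, all bonds single.
complex_mol = Chem.MolFromSmiles('CC(O)N.OC1CCCCC1')
template = Chem.MolFromSmiles('Oc1ccccc1')

fragment_smiles = [Chem.MolToSmiles(frag)
                   for frag in assign_ligand_bond_orders(template, complex_mol)]
print(fragment_smiles)
```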

 Thanks

 James



Re: [Rdkit-discuss] MMFF Problem

2013-10-08 Thread sereina riniker
By the way, UFF also hangs if too many hydrogens with zero coordinates are
present, so it seems to be a general problem. This will be fixed at some
point.

Best,
Sereina


2013/10/8 Nicholas Firth nicholas.fi...@icr.ac.uk

 Hi,

 Thanks Sereina, that's exactly what I'm after! I thought there must be a
 way to do that but didn't look at the function definition. Next time I'll
 remember to check.

 Thanks Paolo as well.


 Best,
 Nick

 *Nicholas C. Firth* | PhD Student | Cancer Therapeutics
 The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton |
 Surrey | SM2 5NG

 *T* 020 8722 4033 | *E* nicholas.fi...@icr.ac.uk | *W* www.icr.ac.uk | *
 Twitter* @ICRnews https://twitter.com/ICRnews

 *Facebook* www.facebook.com/theinstituteofcancerresearch

 *Making the discoveries that defeat cancer*


 On 8 Oct 2013, at 10:51, sereina riniker sereina.rini...@gmail.com
 wrote:

 Hi Nick (and Paolo),

 It's not the molecules that trigger the infinite loop (I tried it with one
 of mine and it hung as well). The problem is that the script adds no
 coordinates for the hydrogens. If you set addCoords=True when adding the
 hydrogens, the minimization works (at least it did for me...).

 for mol in suppl:
     molList.append(Chem.AddHs(mol, addCoords=True))
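
The difference between the two calls can be made visible by looking at where the new hydrogens end up. This is a minimal sketch: the embedded ethanol stands in for a conformation read from an SD file, and the helper hydrogen_distances_from_origin is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Heavy-atom 3D structure, standing in for a conformation read from an SD file.
full = Chem.AddHs(Chem.MolFromSmiles('CCO'))
AllChem.EmbedMolecule(full, randomSeed=42)
heavy = Chem.RemoveHs(full)


def hydrogen_distances_from_origin(mol):
    """Hypothetical helper: distance of every hydrogen from (0, 0, 0)."""
    conf = mol.GetConformer()
    return [conf.GetAtomPosition(atom.GetIdx()).Length()
            for atom in mol.GetAtoms() if atom.GetAtomicNum() == 1]


# Without addCoords the new hydrogens have no computed positions.
without_coords = hydrogen_distances_from_origin(Chem.AddHs(heavy))
# With addCoords=True they are placed around their heavy atoms.
with_coords = hydrogen_distances_from_origin(Chem.AddHs(heavy, addCoords=True))

# Minimize the version with sensible hydrogen positions.
mol = Chem.AddHs(heavy, addCoords=True)
status = AllChem.MMFFOptimizeMolecule(mol)
print(without_coords, with_coords, status)
```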

 Best,
 Sereina



 2013/10/8 Paolo Tosco paolo.to...@unito.it

  Hi Nick,

 would you mind sending me the SD file which is triggering the infinite
 loop? Then I'll come back to you as soon as I find out something.

 Best,
 p.



 On 10/08/2013 11:36 AM, Nicholas Firth wrote:

 Hi RDKitters,

  I'm having an issue using the MMFF to minimise a CORINA conformation.
 I've written a little script which adds hydrogens to a molecule then tries
 to use the MMFF forcefield to minimise the conformer. The problem is that
 the script hangs on the minimise step.

  This error only occurs when I add hydrogens to the conformation, I
 assume the reason for this is because the hydrogens are all added at the
 origin. Is there a way of getting round this (in RDKit, as I want to keep
 the AddHs function)?

  I've included the script below. It works as written; however, if you
 swap the commented-out line with the active one in the first for loop, it
 no longer works.


 from rdkit import Chem
 from rdkit.Chem import AllChem
 from rdkit.Chem import ChemicalForceFields
 from sys import argv

 suppl = Chem.SDMolSupplier(argv[1])
 molList = []

 for mol in suppl:
     #molList.append(Chem.AddHs(mol))
     molList.append(mol)
 del suppl

 #w = Chem.SDWriter(argv[1])
 w = Chem.SDWriter('test.sdf')

 for mol in molList:
     mp = ChemicalForceFields.MMFFGetMoleculeProperties(mol)
     field = AllChem.MMFFGetMoleculeForceField(mol, mp)
     field.Minimize()
     w.write(mol)

 w.close()


 Thanks in advance.

  Best,
 Nick




 --
 ==
 Paolo Tosco, Ph.D.
 Department of Drug Science and Technology
 Via Pietro Giuria, 9 - 10125 Torino (Italy)
 Tel: +39 011 670 7680 | Mob: +39 348 5537206
 Fax: +39 011 670 7687 | E-mail: paolo.tosco@unito.it
 http://open3dqsar.org | http://open3dalign.org
 ==




[Rdkit-discuss] USR/USRCAT implementation in RDKit

2013-08-28 Thread sereina riniker
Dear all,

A C++ implementation and Python wrappers of the ultrafast shape recognition
(USR) descriptor (Ballester and Richards, J. Comput. Chem. (2007), 28,
1711) and the USR CREDO atom types (USRCAT) descriptor (Schreyer and
Blundell, J. Cheminf. (2012), 4, 27) are now available for the RDKit. The
code is based on the Python implementations of Jan Domanski and Adrian
Schreyer.

The descriptors can be accessed from Python via
rdkit.Chem.rdMolDescriptors.GetUSR and
rdkit.Chem.rdMolDescriptors.GetUSRCAT.
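
A minimal usage sketch follows, assuming a molecule with one embedded 3D conformer; the example molecule and the random seed are arbitrary, and GetUSRScore is used here only to compare a descriptor with itself.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolDescriptors

# Both descriptors need 3D coordinates, so embed a conformer first.
mol = Chem.AddHs(Chem.MolFromSmiles('CC(=O)Nc1ccc(O)cc1'))
AllChem.EmbedMolecule(mol, randomSeed=42)

usr = rdMolDescriptors.GetUSR(mol)        # shape moments only
usrcat = rdMolDescriptors.GetUSRCAT(mol)  # shape moments per atom-type subset

# Similarity score between two descriptors; identical inputs as a sanity check.
self_score = rdMolDescriptors.GetUSRScore(usr, usr)
print(len(usr), len(usrcat), self_score)
```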

Best regards,
Sereina


Re: [Rdkit-discuss] Benchmarking platform

2013-07-05 Thread sereina riniker
Dear all,

The source code and compound lists of the benchmarking platform discussed
in J. Cheminf. 5, 26, 2013 (http://www.jcheminf.com/content/5/1/26) are now
available as a separate repository of RDKit on github
(rdkit/benchmarking_platform). The platform is based on RDKit and includes
88 data sets from three public data sources (MUV, DUD and ChEMBL) together
with precalculated training lists (i.e. indices of randomly selected
training molecules) for 5, 10 and 20 training actives.

I hope some of you find this interesting and if you have questions, please
don't hesitate to contact me.

Best regards,
Sereina