Re: [Rdkit-discuss] want advice for good teaching data set

2018-08-29 Thread TJ O'Donnell
Hi Andrew, ChEMBL 24 has compound properties in the table compound_properties. I think alogp is computed using (Crippen) atom types and acd_logp uses ACD Labs methods. TJ On Wed, Aug 29, 2018 at 5:52 AM Andrew Dalke wrote: > Hi all, > > I am starting to put together materials for

[Rdkit-discuss] want advice for good teaching data set

2018-08-29 Thread Andrew Dalke
Hi all, I am starting to put together materials for the Python/RDKit training course I'm giving just before the RDKit UGM next month. I would like to structure part of it around the SQLite release of the ChEMBL data set. More specifically, I plan to include examples of machine learning with

Re: [Rdkit-discuss] Capturing 3D Conformational Flexibility in a Single Descriptor

2018-08-29 Thread Ali Eftekhari
Hi Dr. Cooper, Thanks for your response and the suggestions. I added randomSeed=737 and I now get a value of 14 for the descriptor nConf20 for the ZINC000290539224 molecule (although it differs from the value of 10 in your paper, it no longer changes from run to run). My concern now is on the general usage

Re: [Rdkit-discuss] Capturing 3D Conformational Flexibility in a Single Descriptor

2018-08-29 Thread Richard Cooper
Just to follow up with the details - here is the line in the script to change:

    conformers = AllChem.EmbedMultipleConfs(molecule, numConfs, pruneRmsThresh=0.5, numThreads=3)

to

    conformers = AllChem.EmbedMultipleConfs(molecule, numConfs, pruneRmsThresh=0.5, numThreads=3, randomSeed=737)
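A minimal sketch of what the fixed seed buys, assuming RDKit is installed. The test molecule (1-butanol) and the value of numConfs are placeholders chosen for illustration and are not taken from the script discussed in the thread; only the EmbedMultipleConfs arguments mirror the line quoted above.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Placeholder input (assumption): 1-butanol with explicit hydrogens.
molecule = Chem.AddHs(Chem.MolFromSmiles("CCCCO"))
numConfs = 20  # placeholder value, not the one used in the original script


def embed(seed):
    """Embed conformers on a fresh copy so the two runs do not interfere."""
    mol = Chem.Mol(molecule)
    conformers = AllChem.EmbedMultipleConfs(
        mol, numConfs, pruneRmsThresh=0.5, numThreads=3, randomSeed=seed
    )
    return list(conformers)


# Same seed twice: the number of conformers surviving pruning should match.
ids_a = embed(737)
ids_b = embed(737)
print(len(ids_a), len(ids_b))
```

Leaving randomSeed at its default of -1 lets the embedding start from a different random state on each call, which is the likely source of the run-to-run variation reported earlier in the thread.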

Re: [Rdkit-discuss] want advice for good teaching data set

2018-08-29 Thread Eloy Félix
Hi Andrew, If you want to build a model, I guess that what you want is experimental logP values. This should give you something to start with:

    select ACTIVITY_ID, MOLREGNO, STANDARD_VALUE, STANDARD_TYPE
    from ACTIVITIES
    where STANDARD_TYPE = 'LogP'
    and STANDARD_VALUE is not null
    and
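The query preview above is cut off in the archive, so the sketch below uses only the two conditions that are visible. It runs against a tiny in-memory table that stands in for the ChEMBL ACTIVITIES table; the four rows are made up for illustration, and the ORDER BY clause is an addition here to make the output stable, not part of the original query.

```python
import sqlite3

# Hypothetical stand-in for the ChEMBL SQLite file: an in-memory database
# holding only the four columns named in the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ACTIVITIES (
        ACTIVITY_ID INTEGER PRIMARY KEY,
        MOLREGNO INTEGER,
        STANDARD_VALUE REAL,
        STANDARD_TYPE TEXT
    )
    """
)

# Made-up rows: one LogP with no value and one non-LogP measurement,
# both of which the filter should drop.
conn.executemany(
    "INSERT INTO ACTIVITIES VALUES (?, ?, ?, ?)",
    [
        (1, 101, 2.5, "LogP"),
        (2, 102, None, "LogP"),
        (3, 103, 7.1, "IC50"),
        (4, 104, -0.3, "LogP"),
    ],
)

# Only the conditions visible in the truncated message are used.
query = """
    SELECT ACTIVITY_ID, MOLREGNO, STANDARD_VALUE, STANDARD_TYPE
    FROM ACTIVITIES
    WHERE STANDARD_TYPE = 'LogP'
      AND STANDARD_VALUE IS NOT NULL
    ORDER BY ACTIVITY_ID
"""
logp_rows = conn.execute(query).fetchall()
for row in logp_rows:
    print(row)
```

Against the real release, the same query string would be passed to a connection opened on the downloaded ChEMBL SQLite file instead of ":memory:".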

[Rdkit-discuss] want advice for good teaching data set

2018-08-29 Thread JW Feng via Rdkit-discuss
Hi Andrew, What about building QSAR models to predict activity for a particular ChEMBL assay? This would allow you to discuss the strengths and limitations of QSAR models. Best, JW ___ JW Feng, Ph.D. Denali Therapeutics Inc. 151 Oyster Point Blvd, 2nd Floor, South San Francisco, CA

Re: [Rdkit-discuss] Capturing 3D Conformational Flexibility in a Single Descriptor

2018-08-29 Thread Ali Eftekhari
Thank you very much! This is really helpful! Ali On Wed, Aug 29, 2018 at 7:52 AM Richard Cooper < richardiancooper+rdkitdisc...@gmail.com> wrote: > I think it depends on what you need the descriptor for. If it were for > some kind of fingerprinting, the example implementation would be too

Re: [Rdkit-discuss] Capturing 3D Conformational Flexibility in a Single Descriptor

2018-08-29 Thread Richard Cooper
I think it depends on what you need the descriptor for. If it were for some kind of fingerprinting, the example implementation would be too noisy. We used it to estimate how many low energy conformations of a molecule might be present in a particular system - and it turned out that correlated well

Re: [Rdkit-discuss] Creating Mol Object From SD File

2018-08-29 Thread Dimitri Maziuk via Rdkit-discuss
On 08/29/2018 01:54 PM, Chris Murphy wrote: > Hi, > > I finally realized that when passing an sdf string to Chem.MolFromMolBlock, > the Mol object will not retain the properties from the sdf. Ugh. You're right. +1 for a MolFromSdfBlock() that doesn't lose the properties. > Also, it seems that

[Rdkit-discuss] Creating Mol Object From SD File

2018-08-29 Thread Chris Murphy
Hi, I finally realized that when passing an sdf string to Chem.MolFromMolBlock, the Mol object will not retain the properties from the sdf. Knowing that, I am wondering if there is a way to create a single Mol object from a SDF string. Right now, the only way I know is by using SDMolSupplier:
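One possible way to read a single record from an SDF string while keeping its properties is to feed the text to an SDMolSupplier through SetData. This is a sketch assuming RDKit is installed; it is not the code from the original message, which is cut off in the archive. The ethanol molecule and the "source" property are invented for illustration.

```python
from io import StringIO

from rdkit import Chem

# Build a small SDF string to work with (assumption: ethanol carrying one
# made-up data field called "source").
mol_in = Chem.MolFromSmiles("CCO")
mol_in.SetProp("_Name", "ethanol")
mol_in.SetProp("source", "example")
buffer = StringIO()
writer = Chem.SDWriter(buffer)
writer.write(mol_in)
writer.flush()
sdf_text = buffer.getvalue()

# SDMolSupplier can parse in-memory text via SetData, so the data fields
# of the record are kept as molecule properties.
supplier = Chem.SDMolSupplier()
supplier.SetData(sdf_text)
mol = next(supplier)
print(mol.GetProp("source"))

# For comparison: MolFromMolBlock reads only the mol block and ignores
# the data fields that follow it.
block_only = Chem.MolFromMolBlock(sdf_text)
print(bool(block_only.HasProp("source")))
```

If the string holds several records, iterating over the supplier instead of calling next once would return each molecule in turn.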