Re: [Rdkit-discuss] https://en.wikipedia.org/wiki/Hansen_solubility_parameter

2016-12-08 Thread Brian Cole
Hi Dr. Guillaume, I played around with the ability to map a set of fragments to molecules a couple of months ago. The results of my experiments are here: https://github.com/coleb/fragment_mapper You give it a set of molecules and fragments you would like to have mapped. It tries to find the smallest

Re: [Rdkit-discuss] Generating all stereochem possibilities from smile

2016-12-09 Thread Brian Cole
What makes this API tricky and dangerous? And could we make an easy way to enumerate bond stereo? Thanks! On Fri, Dec 9, 2016 at 5:44 PM, Brian Cole <col...@gmail.com> wrote: > This has me quite curious now, how do we detect unspecified bond stereo > chemistry in RDKit?

Re: [Rdkit-discuss] Generating all stereochem possibilities from smile

2016-12-09 Thread Brian Cole
This has me quite curious now: how do we detect unspecified bond stereochemistry in RDKit? m = Chem.MolFromSmiles("FC=CF") assert m.HasProp("_StereochemDone") for bond in m.GetBonds(): print(bond.GetBondDir(), bond.GetStereo()) Yields: (rdkit.Chem.rdchem.BondDir.NONE,

Re: [Rdkit-discuss] Bug in AllChem.EmbedMultipleConfs pruning?

2016-12-22 Thread Brian Cole
RMSD calculations that use automorphism symmetries with hydrogens are crazy expensive. Symmetry should be on by default, but without hydrogens. I would even love to see the RMSD automorphism symmetry code ignore trifluoro-type groups too, as they dramatically increase the cost of the computation with

[Rdkit-discuss] Preserving hydrogens necessary for imine cis/trans stereochemistry?

2017-05-17 Thread Brian Cole
Is there a recommended way in RDKit to preserve hydrogens necessary for representing cis/trans stereochemistry of imines? For example, given the attached SDF I need to maintain explicit hydrogens in the output SMILES string to maintain the imine cis/trans stereochemistry. mol =

Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-07 Thread Brian Cole
You can use Chem.CanonicalRankAtoms to de-duplicate the SMARTS matches based upon the atom symmetry like this: def count_unique_substructures(smiles, smarts): mol = Chem.MolFromSmiles(smiles) ranks = list(Chem.CanonicalRankAtoms(mol, breakTies=False)) pattern =
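The de-duplication idea in the snippet above can be sketched without RDKit installed. This is a minimal sketch under stated assumptions, not the code from the original message (which is truncated here): `count_unique_matches` is a hypothetical helper, `ranks` stands in for the list produced by Chem.CanonicalRankAtoms(mol, breakTies=False), and `matches` stands in for the atom-index tuples a substructure search would return. The rank values are made up for illustration.

```python
def count_unique_matches(matches, ranks):
    """Count matches that stay distinct once atom indices are replaced by symmetry ranks."""
    # Sorting makes the key independent of the order in which pattern atoms matched.
    unique = {tuple(sorted(ranks[idx] for idx in match)) for match in matches}
    return len(unique)


# Propane, C-C-C: the two terminal carbons are symmetry-equivalent,
# so they share a rank (illustrative values, not real RDKit output).
ranks = [0, 1, 0]
# Hypothetical matches of a two-carbon pattern: atoms (0, 1) and (1, 2).
matches = [(0, 1), (1, 2)]

print(count_unique_matches(matches, ranks))  # 1
```

Sorting the ranks inside each match is a design choice made for this sketch: it treats two matches as the same substructure whenever they cover the same symmetry classes, regardless of which pattern atom landed on which class.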

Re: [Rdkit-discuss] RDKit appears to be parsing SMILES stereochemistry differently

2017-11-09 Thread Brian Cole
Here's an example of why this is useful for maintaining molecular fragmentation inside your molecular representation: >>> from rdkit import Chem >>> smiles = 'F9.[C@]91(C)CCO1' >>> fluorine, core = smiles.split('.') >>> fluorine 'F9' >>> fragment = core.replace('9', '([*:9])') >>> fragment
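The string handling in the snippet above needs only plain Python, so it can be reproduced without RDKit. This is a minimal sketch of just that step; it says nothing about how RDKit interprets the chirality of either string, which is the actual subject of the thread. Note that the blanket replace of '9' is only safe here because the core contains a single '9'.

```python
smiles = 'F9.[C@]91(C)CCO1'

# Split the dot-disconnected SMILES into the substituent and the core.
fluorine, core = smiles.split('.')

# Swap the ring-closure digit 9 for a branch holding a mapped dummy atom.
fragment = core.replace('9', '([*:9])')

print(fluorine)  # F9
print(fragment)  # [C@]([*:9])1(C)CCO1
```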

Re: [Rdkit-discuss] RDKit appears to be parsing SMILES stereochemistry differently

2017-11-09 Thread Brian Cole
> > Somehow you got the code to generate a "9" for that ring closure, which is > not something that RDKit does naturally, so we are only seeing a step in > the larger part of your goal. > Certainly, but thousands of lines of Python don't fit in an email in an easily digestible way. :-) >

[Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-05-31 Thread Brian Cole
It appears that Postgres 9.6+ now supports parallel queries to accelerate slow queries: https://www.postgresql.org/docs/10/static/parallel-query.html Has anyone successfully gotten this to accelerate substructure queries with the RDKit Postgres cartridge? Thanks, Brian

Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-06-01 Thread Brian Cole
seemed fine. > The problem (and it's a sizable one) is that parallel queries don't use > the index. Until parallel scans using GIST indices work, I don't think this > is really going to help much. > > -greg > > > On Fri, Jun 1, 2018 at 12:04 AM Brian Cole wrote: > >> It

Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-06-01 Thread Brian Cole
they should. Does a ::mol query on the same table parallelize? If > it does but a ::qmol query does not maybe I forgot something in the SQL > function definitions > > On Fri, 1 Jun 2018 at 15:43, Brian Cole wrote: > >> Hi Greg, >> >> Are SMARTS searches with the

Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-16 Thread Brian Cole
+1 to the MolVS project as well. Perhaps an easy bite-size project is to incorporate the open source mae parser code into core RDKit: https://github.com/schrodinger/maeparser On Mon, Jan 15, 2018 at 9:08 PM, Francois BERENGER < beren...@bioreg.kyushu-u.ac.jp> wrote: > On 01/16/2018 05:51 AM,

Re: [Rdkit-discuss] Calculating the MOE vsa_acc descriptor using the rdkit (or other Open Source software)?

2018-02-19 Thread Brian Cole
Hi Richard, You can calculate the per-atom contributions to the surface area with _CalcLabuteASAContribs: http://www.rdkit.org/Python_Docs/rdkit.Chem.rdMolDescriptors-module.html#_CalcLabuteASAContribs If you have the MOE SMARTS for "pure hydrogen bond acceptors", the following is the Python I

Re: [Rdkit-discuss] conda build instructions for OSX?

2018-01-02 Thread Brian Cole
ll stuck on is how to build RDKit's master branch using conda. Changing `git_rev` in rdkit/meta.yaml didn't have the desired effect. -Brian On Wed, Dec 27, 2017 at 5:08 PM, Brian Cole <col...@gmail.com> wrote: > Trying to 'conda build rdkit' as described in the > https://github.com/rdki

[Rdkit-discuss] conda build instructions for OSX?

2017-12-27 Thread Brian Cole
Trying to 'conda build rdkit' as described in the https://github.com/rdkit/conda-rdkit README without success. Are there any OSX 'conda build' instructions tucked away somewhere? It's currently failing on the cairo dependency: -- Checking for one of the modules 'cairo' CMake Error at

Re: [Rdkit-discuss] Chemical Formula to SMILES

2018-08-12 Thread Brian Cole
While Dr. Guillaume is correct, there are some ways to find known molecules given the formula by hacking InChI strings. For example just google search the formula with the InChI prefix, e.g., InChI=1S/C16H14O10.
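The search-string trick above can be written as a one-line helper. This is a minimal sketch: `inchi_search_prefix` is a hypothetical name, and it assumes the formula is already written the way InChI writes it (Hill order), since the function does no reordering or validation.

```python
def inchi_search_prefix(formula):
    """Build the leading part of a standard InChI string for a molecular formula."""
    # "1S" marks a standard InChI; the formula layer follows the first slash.
    return f"InChI=1S/{formula.strip()}"


print(inchi_search_prefix("C16H14O10"))  # InChI=1S/C16H14O10
```

Pasting the resulting string into a web search, as the message suggests, is then a manual step.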

Re: [Rdkit-discuss] descriptors beyond rotatable bond count and possible correlations with entropy

2018-09-01 Thread Brian Cole
Little late to the party, but here is an RDKit implementation of a contiguous rotatable bond count I wrote a while ago: https://gist.github.com/coleb/4737a1dc77b5f5f8a7bbe4b23f39f2c4 Doesn't return the actual bonds like Paolo's does. But it does take into account amides, triple bonds, and terminal

Re: [Rdkit-discuss] Interest in a RDkit UGM in the USA midwest?

2018-04-10 Thread Brian Cole
I would be interested, but not sure we would have such a large draw in the Midwest as we would in Cambridge MA. Potential idea would be to schedule it around the SciPy Conference? https://scipy2018.scipy.org/ehome/index.php?eventid=299527; Was thinking about checking that out this year. -Brian

Re: [Rdkit-discuss] seg fault when importing Chem on OS-X 10.12

2018-04-16 Thread Brian Cole
ame #6: 0x7fff5fe23015 libdyld.dylib`start + 1 frame #7: 0x7fff5fe23015 libdyld.dylib`start + 1 (lldb) info threads On Mon, Apr 16, 2018 at 1:11 PM, Brian Cole <col...@gmail.com> wrote: > An issue like this was fixed in the past: https://github.com/ > rdkit/rdkit/comm

Re: [Rdkit-discuss] seg fault when importing Chem on OS-X 10.12

2018-04-16 Thread Brian Cole
working. -Brian On Mon, Apr 16, 2018 at 1:20 PM, Brian Cole <col...@gmail.com> wrote: > I can reproduce the problem, and the issue does appear to be different > than the previous issue. Reproducible with the following on OSX: > > $ conda create -c rdkit -n rdkit_2017 rdkit pyt

Re: [Rdkit-discuss] seg fault when importing Chem on OS-X 10.12

2018-04-16 Thread Brian Cole
An issue like this was fixed in the past: https://github.com/rdkit/rdkit/commit/009dd580527caa662de8bac5ad0c60f1e9bc90cd Will see if I can reproduce this. -Brian On Mon, Apr 16, 2018 at 12:09 PM, Patrick Walters wrote: > Hi All, > > I installed the latest RDKit using

[Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-20 Thread Brian Cole
Hi Chem-informaticians: I know it has been talked about in the community that fingerprints are not a way to obfuscate molecules for security, but I don't recall a paper actually demonstrating how to reverse engineer a fingerprint into a chemical structure. Does anyone know if such a paper

Re: [Rdkit-discuss] Any known papers on reverse engineering fingerprints into structures?

2018-04-23 Thread Brian Cole
Thanks Andrew, very interesting and useful script! Unfortunately it doesn't work on circular/ECFP-like fingerprints. It has the requirement that the fingerprint be a substructure fingerprint as you described. It seems the evolutionary/genetic algorithm approach is the current state-of-the-art for

[Rdkit-discuss] Do reactions need a useChirality flag?

2018-09-27 Thread Brian Cole
I'm trying to get a reaction SMARTS pattern to ignore chiral atoms and it doesn't appear straightforward. First, it appears RDKit doesn't support '!@' to indicate a non-chiral specified atom. I have to wrap this in a recursive SMARTS to get it to work. For example: In [2]: mol =

[Rdkit-discuss] Docs intentionally broken?

2018-11-05 Thread Brian Cole
My google search for 'rdkit python point3d' yielded the following as the top result: https://rdkit.org/docs/api/rdkit.Geometry.rdGeometry-module.html Which unfortunately now has a 404, page not found. Was this an intentional reorganization of the documentation? -Brian

Re: [Rdkit-discuss] Double Bond Stereochemistry in the RDKit

2018-12-04 Thread Brian Cole
Hi Kovas, For your use-case #2 should suffice, "set STEREOCIS/STEREOTRANS tags + manually set stereo atoms". This is what the EnumerateStereoisomers code does: https://github.com/rdkit/rdkit/blob/master/rdkit/Chem/EnumerateStereoisomers.py#L38 As to what is the 'ground truth', that is a more

Re: [Rdkit-discuss] Error parsing a MUTAG smiles

2020-03-04 Thread Brian Cole
Note, the location of the first opening parenthesis is different: >>> 'c1ccc2=NC3=CC(=CC=C3=c2c1)[N+](=O)[O-]'.find('(') 13 >>> 'c1ccc2=NC3=CC=C(C=C3=c2c1)[N+](=O)[O-]'.find('(') 15 So the SMILES are syntactically correct to represent 2- and 3-nitrocarbazole, though semantically weird as they're
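The comparison above can be extended to show where the two strings first diverge, again with plain Python only. This is a minimal sketch; `first_difference` is a hypothetical helper and it compares the raw text, not the chemistry.

```python
def first_difference(a, b):
    """Return the index of the first differing character, or None if the strings are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string is a prefix of the other, or they are identical.
    return None if len(a) == len(b) else min(len(a), len(b))


smiles_a = 'c1ccc2=NC3=CC(=CC=C3=c2c1)[N+](=O)[O-]'
smiles_b = 'c1ccc2=NC3=CC=C(C=C3=c2c1)[N+](=O)[O-]'

print(smiles_a.find('('), smiles_b.find('('))  # 13 15
print(first_difference(smiles_a, smiles_b))    # 13
```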

Re: [Rdkit-discuss] rdkit-cartridge: Inserting new molecules

2020-10-26 Thread Brian Cole
Hi Thomas, It's possible to use TEMPORARY TABLE for this purpose in a single transaction. This is the scheme we use in order to convert the input application SMILES into a canonicalized RDKit SMILES. We keep the RDKit canonical SMILES around in the table for exact isomer lookups, but this lets

Re: [Rdkit-discuss] RDKit version in AWS Aurora?

2021-06-07 Thread Brian Cole
Landrum wrote: > Hi Brian, > > On Mon, Jun 7, 2021 at 4:36 AM Brian Cole wrote: > >> This is a bit more of a question for AWS themselves, though I believe the >> RDKit build for the Postgres extension can be improved as well. >> >> The AWS documentation

Re: [Rdkit-discuss] para-stereochemistry

2021-05-27 Thread Brian Cole
I always refer back to this graphic in Alberto Gobbi's "Handling of Tautomerism and Stereochemistry in Compound Registration" paper: https://pubs.acs.org/doi/10.1021/ci200330x [image: image.png] @Greg Landrum, I would interpret "para stereochemistry" as #3 in the above image. And "dependent

[Rdkit-discuss] RDKit version in AWS Aurora?

2021-06-06 Thread Brian Cole
This is a bit more of a question for AWS themselves, though I believe the RDKit build for the Postgres extension can be improved as well. The AWS documentation states, “RDKit extension version 3.8.”

Re: [Rdkit-discuss] validating stereochemistry

2021-09-27 Thread Brian Cole
Good Morning Tim, The RDKit EnumerateStereoisomers function accomplishes this through the ‘tryEmbedding’ flag: https://github.com/rdkit/rdkit/blob/d20e5cadc81bf6c7b4e590124866f178f2f2fe28/rdkit/Chem/EnumerateStereoisomers.py#L8 It attempts to generate a 3D conformer for the given stereo