Hi Thomas,
It's possible to use a TEMPORARY TABLE for this purpose in a single
transaction. This is the scheme we use to convert the input
application SMILES into a canonicalized RDKit SMILES. We keep the RDKit
canonical SMILES around in the table for exact isomer lookups, but this
lets us
I always refer back to this graphic in Alberto Gobbi's "Handling of
Tautomerism and Stereochemistry in Compound Registration" paper:
https://pubs.acs.org/doi/10.1021/ci200330x
[image: image.png]
@Greg Landrum, I would interpret "para
stereochemistry" as #3 in the above image. And "dependent ster
This is a bit more of a question for AWS themselves, though I believe the RDKit
build for the Postgres extension can be improved as well.
The AWS documentation states, “RDKit extension version 3.8.”
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Updates.20180305.htm
Landrum wrote:
> Hi Brian,
>
> On Mon, Jun 7, 2021 at 4:36 AM Brian Cole wrote:
>
>> This is a bit more of a question for AWS themselves, though I believe the
>> RDKit build for the Postgres extension can be improved as well.
>>
>> The AWS documentation
Good Morning Tim,
The RDKit EnumerateStereoisomers function accomplishes this through the
‘tryEmbedding’ flag:
https://github.com/rdkit/rdkit/blob/d20e5cadc81bf6c7b4e590124866f178f2f2fe28/rdkit/Chem/EnumerateStereoisomers.py#L8
It attempts to generate a 3D conformer for the given stereo config
I would second the suggestion of continuing to push forward a JSON format that
natively supports multiple conformers.
I've never seen automatic recombination of an SDF work 100% of the time; it's
fraught with corner cases. It's also abysmally slow and takes a huge amount of
disk space.
-Bruce
>
Hi Dr. Guillaume,
I played around with the ability to map a set of fragments to molecules a
couple of months ago. The results of my experiments are here:
https://github.com/coleb/fragment_mapper
You give it a set of molecules and fragments you would like to have mapped.
It tries to find the smallest
Any advice on getting RDKit to read in SDF files that use bond order '4' to
mark bonds as aromatic and don't have explicit hydrogens? For example,
imagine two fused heterocycles where the hydrogen isn't really known. I
have SDF files that just mark the bond orders as '4', aromatic, and don't
even tr
This has me quite curious now: how do we detect unspecified bond
stereochemistry in RDKit?
from rdkit import Chem
m = Chem.MolFromSmiles("FC=CF")
assert m.HasProp("_StereochemDone")
for bond in m.GetBonds():
    print(bond.GetBondDir(), bond.GetStereo())
Yields:
(rdkit.Chem.rdchem.BondDir.NONE, rdkit.Chem.rdchem.Bond
't expose an "easy" way to do this. What makes
this API tricky and dangerous? And could we make an easy way to
enumerate bond stereo?
Thanks!
On Fri, Dec 9, 2016 at 5:44 PM, Brian Cole wrote:
> This has me quite curious now, how do we detect unspecified
RMSD with automorphism symmetries that include hydrogens is crazy expensive to
calculate. Symmetry should be on by default, but without hydrogens. I would
even love to see the RMSD automorphism symmetry code ignore trifluoro-type
groups too, as they dramatically increase the cost of the computation with
little
Is there a recommended way in RDKit to preserve hydrogens necessary for
representing cis/trans stereochemistry of imines?
For example, given the attached SDF, I need to maintain explicit hydrogens
in the output SMILES string to maintain the imine cis/trans
stereochemistry.
mol = Chem.ForwardSDMol
You can use Chem.CanonicalRankAtoms to de-duplicate the SMARTS matches
based upon the atom symmetry like this:
def count_unique_substructures(smiles, smarts):
mol = Chem.MolFromSmiles(smiles)
ranks = list(Chem.CanonicalRankAtoms(mol, breakTies=False))
pattern = Chem.MolFromSmarts(smart
Hi Cheminformaticians,
This is an extreme subtlety in the interpretation of SMILES atom
stereochemistry and I think a bug in RDKit. Specifically, I think the
following SMILES should be the same molecule:
>>> rdkit.__version__
'2017.09.1'
>>> Chem.CanonSmiles('F[C@@]1(C)CCO1')
'C[C@]1(F)CCO1'
>>>
Here's an example of why this is useful for maintaining molecular
fragmentation inside your molecular representation:
>>> from rdkit import Chem
>>> smiles = 'F9.[C@]91(C)CCO1'
>>> fluorine, core = smiles.split('.')
>>> fluorine
'F9'
>>> fragment = core.replace('9', '([*:9])')
>>> fragment
'[C@]([*
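The string manipulation in the session above can be sketched with plain Python string operations; no RDKit call is made here. The helper name `split_on_ring_closure` is hypothetical, and the sketch assumes the SMILES has exactly two dot-separated parts and that the chosen digit occurs once in each part, only as a ring-closure label (not inside an isotope, charge, or atom-map number).

```python
def split_on_ring_closure(smiles, digit="9"):
    """Split 'X<d>.CORE<d>...' into the substituent and a core fragment.

    Hypothetical helper (not an RDKit function). Assumes two dot-separated
    parts, with `digit` used exactly once in each as a ring-closure label.
    """
    substituent, core = smiles.split(".")
    if substituent.count(digit) != 1 or core.count(digit) != 1:
        raise ValueError("ring-closure digit must occur once in each part")
    # Turn the ring-closure label on the core into a mapped dummy-atom branch,
    # leaving every other character (including the chirality tag) untouched.
    fragment = core.replace(digit, "([*:%s])" % digit)
    return substituent, fragment


substituent, fragment = split_on_ring_closure("F9.[C@]91(C)CCO1")
print(substituent)
print(fragment)
```

Because only the ring-closure label is rewritten, the neighbor order written around the stereocenter stays exactly as it was in the input string.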
>
> Somehow you got the code to generate a "9" for that ring closure, which is
> not something that RDKit does naturally, so we are only seeing a step in
> the larger part of your goal.
>
Certainly, but thousands of lines of Python don't fit in an email in an
easily digestible way. :-)
> Since
Trying to 'conda build rdkit' as described in the
https://github.com/rdkit/conda-rdkit README, with no success. Are there any
OSX 'conda build' instructions tucked away somewhere?
It's currently failing on the cairo dependency:
-- Checking for one of the modules 'cairo'
CMake Error at
/Users/coleb/a
' works.
Now the next trick I'm still stuck on is how to build RDKit's master branch
using conda. Changing `git_rev` in rdkit/meta.yaml didn't have the desired
effect.
-Brian
On Wed, Dec 27, 2017 at 5:08 PM, Brian Cole wrote:
> Trying to 'conda build rdkit' as descr
+1 to the MolVS project as well.
Perhaps an easy bite-size project is to incorporate the open source mae
parser code into core RDKit: https://github.com/schrodinger/maeparser
On Mon, Jan 15, 2018 at 9:08 PM, Francois BERENGER <
beren...@bioreg.kyushu-u.ac.jp> wrote:
> On 01/16/2018 05:51 AM, Ti
Hi Richard,
You can calculate the per-atom contributions to the surface area with
_CalcLabuteASAContribs:
http://www.rdkit.org/Python_Docs/rdkit.Chem.rdMolDescriptors-module.html#_CalcLabuteASAContribs
If you have the MOE SMARTS for "pure hydrogen bond acceptors", the
following is the Python I cu
I would be interested, but I'm not sure we would have as large a draw in the
Midwest as we would in Cambridge, MA.
One idea would be to schedule it around the SciPy Conference?
https://scipy2018.scipy.org/ehome/index.php?eventid=299527&;
Was thinking about checking that out this year.
-Brian
An issue like this was fixed in the past:
https://github.com/rdkit/rdkit/commit/009dd580527caa662de8bac5ad0c60f1e9bc90cd
Will see if I can reproduce this.
-Brian
On Mon, Apr 16, 2018 at 12:09 PM, Patrick Walters
wrote:
> Hi All,
>
> I installed the latest RDKit using conda
>
> conda create -c
96
frame #5: 0x00011301 python`main + 497
frame #6: 0x7fff5fe23015 libdyld.dylib`start + 1
frame #7: 0x7fff5fe23015 libdyld.dylib`start + 1
(lldb) info threads
On Mon, Apr 16, 2018 at 1:11 PM, Brian Cole wrote:
> An issue like this was fixed in the past: https://github.
2017
working.
-Brian
On Mon, Apr 16, 2018 at 1:20 PM, Brian Cole wrote:
> I can reproduce the problem, and the issue does appear to be different
> than the previous issue. Reproducible with the following on OSX:
>
> $ conda create -c rdkit -n rdkit_2017 rdkit python=3.5
>
Hi Chem-informaticians:
I know it has been discussed in the community that fingerprints are not
a way to obfuscate molecules for security, but I don't recall a paper
actually demonstrating the reverse engineering of a fingerprint into a
chemical structure. Does anyone know if such a paper exist
Thanks Andrew, very interesting and useful script!
Unfortunately it doesn't work on circular/ECFP-like fingerprints. It has
the requirement that the fingerprint be a substructure fingerprint as you
described. It seems the evolutionary/genetic algorithm approach is the
current state-of-the-art for
It appears that Postgres 9.6+ now supports parallel queries to accelerate
slow queries:
https://www.postgresql.org/docs/10/static/parallel-query.html
Has anyone successfully gotten this to accelerate substructure queries with
the RDKit Postgres cartridge?
Thanks,
Brian
--
hings seemed fine.
> The problem (and it's a sizable one) is that parallel queries don't use
> the index. Until parallel scans using GIST indices work, I don't think this
> is really going to help much.
>
> -greg
>
>
> On Fri, Jun 1, 2018 at 12:04 AM Brian Cole wro
un 1, 2018 at 10:07 AM, Greg Landrum
wrote:
> I think they should. Does a ::mol query on the same table parallelize? If
> it does but a ::qmol query does not maybe I forgot something in the SQL
> function definitions
>
> On Fri, 1 Jun 2018 at 15:43, Brian Cole wrote:
>
>
While Dr. Guillaume is correct, there are some ways to find known molecules
given the formula by hacking InChI strings.
For example, just run a Google search for the formula with the InChI prefix, e.g.,
InChI=1S/C16H14O10.
https://www.google.com/search?safe=off&rlz=1C5CHFA_enUS700US700&ei=4ltwW5yzLYvBjwS99L
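The search trick above can be sketched as a small helper that builds the InChI prefix and a search URL from a formula. This is only an illustration: the function names are hypothetical, wrapping the prefix in double quotes to force a phrase match is an assumption of this sketch, and the formula is assumed to already be in the Hill order that the InChI formula layer uses.

```python
from urllib.parse import urlencode


def inchi_formula_prefix(formula):
    """Return the standard-InChI prefix for a formula, e.g. 'InChI=1S/C16H14O10'.

    Assumption: `formula` is already in Hill order; it is not reordered
    or validated here.
    """
    return "InChI=1S/" + formula.strip()


def search_url(formula, base="https://www.google.com/search"):
    """Build a search URL for the quoted InChI prefix (hypothetical helper)."""
    # Double quotes ask the search engine to treat the prefix as one phrase.
    query = '"%s"' % inchi_formula_prefix(formula)
    return base + "?" + urlencode({"q": query})


url = search_url("C16H14O10")
print(url)
```

Matching on the prefix only finds molecules whose InChI strings appear on indexed pages, so it is a lookup of known compounds rather than an enumeration of isomers.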
A little late to the party, but here is an RDKit implementation of a
contiguous rotatable bond count I wrote a while ago:
https://gist.github.com/coleb/4737a1dc77b5f5f8a7bbe4b23f39f2c4
Doesn't return the actual bonds like Paolo's does. But it does take into
account amides, triple bonds, and terminal
I'm trying to get a reaction SMARTS pattern to ignore chiral atoms and it
doesn't appear straightforward. First, it appears RDKit doesn't support
'!@' to indicate a non-chiral specified atom. I have to wrap this in a
recursive SMARTS to get it to work. For example:
In [2]: mol = Chem.MolFromSmiles
My Google search for 'rdkit python point3d' yielded the following as the
top result:
https://rdkit.org/docs/api/rdkit.Geometry.rdGeometry-module.html
Unfortunately, it now returns a 404 (page not found).
Was this an intentional reorganization of the documentation?
-Brian
Hi Kovas,
For your use-case #2 should suffice, "set STEREOCIS/STEREOTRANS tags +
manually set stereo atoms". This is what the EnumerateStereoisomers code
does:
https://github.com/rdkit/rdkit/blob/master/rdkit/Chem/EnumerateStereoisomers.py#L38
As to what is the 'ground truth', that is a more diff
Note, the location of the first opening parenthesis is different:
>>> 'c1ccc2=NC3=CC(=CC=C3=c2c1)[N+](=O)[O-]'.find('(')
13
>>> 'c1ccc2=NC3=CC=C(C=C3=c2c1)[N+](=O)[O-]'.find('(')
15
So the SMILES are syntactically correct representations of 2- and
3-nitrocarbazole, though semantically weird as they're a