[Rdkit-discuss] double bond stereochemistry question

2017-08-11 Thread Tyler Backman
I am trying to write a series of dehydratase reactions with RDKit that
create specific double bond stereochemistry, but I have found that
installing one double bond often reverses the stereochemistry of other
double bonds already present in the molecule.

For example, I create the following molecule which currently has a cis
double bond:

### BEGIN EXAMPLE CODE BLOCK ###
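import rdkit.Chem
import rdkit.Chem.AllChem
# raw string so the backslash in the SMILES is not read as an escape
chain = rdkit.Chem.MolFromSmiles(r'O=C([S])C[C@@H](O)/C=[C]\C')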
### END EXAMPLE CODE BLOCK ###

Then I perform the following reaction:

### BEGIN EXAMPLE CODE BLOCK ###
rxn = rdkit.Chem.AllChem.ReactionFromSmarts(
    r'[C:1][C:2]([O:3])[C:4][C:6](=[O:7])[S:8]>>'
    r'[C:1]/[C:2]=[C:4]\[C:6](=[O:7])[S:8].[O:3]')
prod = rxn.RunReactants((chain,))[0][0]
print(rdkit.Chem.MolToSmiles(prod, isomericSmiles=True))
### END EXAMPLE CODE BLOCK ###

This outputs the following molecule, which has the new cis double
bond, but the pre-existing one has flipped to trans:

C/[C]=C/C=C\C(=O)[S]

Any ideas? I would like to create a new cis double bond without
modifying the stereochemistry of any pre-existing double bonds in the
structure.
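
One way to see which bond changed is to list the stereo label RDKit
assigns to every double bond and compare the lists before and after
the reaction. The following is only a minimal sketch: the helper name
double_bond_stereo is hypothetical, and the two 2-butene SMILES are
stand-in inputs rather than the molecules from this report.

```python
from rdkit import Chem


def double_bond_stereo(mol):
    """Return (begin_idx, end_idx, stereo) for every double bond in mol."""
    labels = []
    for bond in mol.GetBonds():
        if bond.GetBondType() == Chem.BondType.DOUBLE:
            labels.append((bond.GetBeginAtomIdx(),
                           bond.GetEndAtomIdx(),
                           bond.GetStereo()))
    return labels


# Stand-in molecules (assumption): cis- and trans-2-butene.
cis = Chem.MolFromSmiles(r'C/C=C\C')
trans = Chem.MolFromSmiles('C/C=C/C')

for name, mol in (('cis', cis), ('trans', trans)):
    for begin, end, stereo in double_bond_stereo(mol):
        print(name, begin, end, stereo)
```

Products returned by RunReactants are not sanitized, so it is probably
necessary to call Chem.SanitizeMol and Chem.AssignStereochemistry on a
product before reading its labels. Atom indices in a product can also
differ from those in the reactant, so the labels are easier to compare
by looking at the neighbouring atoms than by raw index.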

I am using RDKit Release_2017_03_3 on Python 3.4.2 under Debian 8.

Sincerely,

Tyler W. H. Backman
Postdoctoral Fellow
Lawrence Berkeley National Laboratory
Joint BioEnergy Institute
Agile BioFoundry

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Building RDKit from source

2017-10-03 Thread Tyler Backman
This might also be of interest: I have a Dockerfile that builds
RDKit, including the PostgreSQL cartridge:
https://github.com/JBEI/clusterCAD/blob/master/debian-cheminformatics/Dockerfile

Sincerely,
Tyler Backman

On Tue, Oct 3, 2017 at 2:45 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> Thanks for sharing that Kovas.
> I'm sure this will be helpful for people who don't want to/can't use
> anaconda.
>
> Best,
> -greg
>
>
> On Mon, Oct 2, 2017 at 7:34 PM, Kovas Palunas <kovas.palu...@arzeda.com>
> wrote:
>>
>> Hi all,
>>
>>
>> I thought I'd share a script I wrote to build RDKit and Boost together
>> which has worked for me on Linux (CentOS) and Mac machines so far.  I run
>> RDKit in a virtualenv Python environment (not in anaconda), so this may only
>> be helpful for a small group of RDKitters.  Hopefully some of you do find
>> this useful - it has personally saved me a lot of time getting RDKit
>> installed on multiple machines.
>>
>>
>> Note: please skim through the script to make sure you know what variables
>> inside it are set to what before running - there are multiple ways to
>> specify what code to build that may be useful for different purposes (and
>> some are commented out).
>>
>>
>> Make sure you pip install numpy before running (I should probably just add
>> this to the script).
>>
>>
>> Also, I have only tested this on RDKit 2016_09_3 and Boost 1_63_0.
>>
>>
>>  - Kovas



-- 
Tyler W. H. Backman
Postdoctoral Fellow
Lawrence Berkeley National Laboratory
Joint BioEnergy Institute
Agile BioFoundry



Re: [Rdkit-discuss] suggestions for comprehensive searchable database of natural products

2017-11-28 Thread Tyler Backman
Hi Jim,

MIBiG is a useful database of natural product gene clusters and
structures, which you can download in JSON format and use pretty
easily from within Python:
https://mibig.secondarymetabolites.org/repository.html
It also includes pathway and organism information.
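
Working with the downloaded JSON from Python might look roughly like
the sketch below. It uses only the standard library. The directory
layout, the chem_struct key, and the example record are assumptions
made for illustration and should be checked against the actual files;
find_values and load_entries are hypothetical helper names.

```python
import json
from pathlib import Path


def find_values(node, key):
    """Yield every value stored under `key` anywhere in nested JSON data."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def load_entries(directory):
    """Parse every *.json file in `directory` into {file stem: parsed data}."""
    return {path.stem: json.loads(path.read_text(encoding='utf-8'))
            for path in sorted(Path(directory).glob('*.json'))}


# Hypothetical record; the real schema may use different keys and nesting.
example = {
    'organism_name': 'Example organism',
    'compounds': [
        {'compound': 'compound A', 'chem_struct': 'CCO'},
        {'compound': 'compound B', 'chem_struct': 'CC(=O)O'},
    ],
}

print(list(find_values(example, 'chem_struct')))
print(list(find_values(example, 'organism_name')))
```

Searching by key name rather than by a fixed path keeps the sketch
independent of how deeply the structures are nested, at the cost of
ignoring any context around each match.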

Secondly, our ClusterCAD database is built with RDKit and Django, but
only includes Type I modular PKSs imported from MIBiG. You can use it
online at clustercad.jbei.org, or view the code and launch a Docker
install locally from https://github.com/JBEI/clusterCAD. Internally,
it has an RDKit PostgreSQL database, and includes predicted chemical
intermediates at each step of biosynthesis in addition to final
products. It is hand curated, to improve on the automatic antiSMASH
annotations in MIBiG. I will gradually expand this to support a
greater diversity of natural products. I could send you an example
Jupyter notebook for using it programmatically.

Sincerely,
Tyler

On Mon, Nov 27, 2017 at 1:30 PM, James T. Metz via Rdkit-discuss wrote:
> RDkit Discussion Group,
>
> My apologies in advance if my request is not appropriate for this
> discussion group.
>
> Given a small molecule that might have some resemblance to natural
> products, can someone suggest a free, comprehensive, PYTHON/RDkit
> searchable database of natural products that might be suitable for
> similarity and substructure searching.
>
> I am aware of a few websites that permit searching on the website. If
> possible, I would like to programmatically search by running a
> PYTHON/RDkit script on my local machine and then return the structures
> of related molecules to my local script.
>
> I would prefer not having to download and store a huge database.
>
> Also, if possible, it would be important to return the organism(s) that
> creates the natural product.  Pathway information would be also very,
> very helpful.
>
> I greatly welcome comments and suggestions.
>
> Thank you.
>
> Regards,
> Jim Metz
> Northwestern University



-- 
Tyler W. H. Backman
Postdoctoral Fellow
Lawrence Berkeley National Laboratory
Joint BioEnergy Institute



[Rdkit-discuss] Some compounds don't match themselves after converting to isomeric SMILES and back?

2018-06-07 Thread Tyler Backman
If I convert some compounds to SMILES and back, they sometimes no
longer match themselves with MCS when matchChiralTag=True, even though
the number of each type of chiral tag is still the same according to
Chem.FindMolChiralCenters(). Am I doing something wrong here?

This example uses the rapamycin structure from selleckchem:
http://file.selleckchem.com/downloads/product-sdf/rapamycin-sirolimus-s1039.SDF

### BEGIN CODE EXAMPLE ###
import rdkit.Chem as chem
from rdkit.Chem.rdFMCS import FindMCS

# read in the compound
mymol = chem.MolFromMolFile('rapamycin-sirolimus-s1039.SDF',
                            sanitize=True, removeHs=True,
                            strictParsing=True)
print(mymol.GetNumAtoms())

# convert to smiles and back
mymol2 = chem.MolFromSmiles(chem.MolToSmiles(mymol, canonical=True,
                                             isomericSmiles=True))
print(mymol2.GetNumAtoms())

# compare the converted molecule to the original with chirality
myMCS = FindMCS([mymol, mymol2], matchChiralTag=True, matchValences=True)
print(myMCS.numAtoms)

# compare the converted molecule to the original without chirality
myMCSNotChiral = FindMCS([mymol, mymol2], matchChiralTag=False,
                         matchValences=True)
print(myMCSNotChiral.numAtoms)
### END CODE EXAMPLE ###

All four of these should print 65, but the third, which compares the
molecule to itself after converting to SMILES and back with chirality
matching on, matches only 19 atoms:

### BEGIN OUTPUT ###
65
65
19
65
### END OUTPUT ###

I generated this example w/ rdkit 2018.03.2.0 on python 3.6.5
installed via Anaconda on OSX 10.12.6.

I also reproduced this with a version of rapamycin I constructed
de novo using SMARTS reactions for each enzyme. Its isomeric SMILES
was identical to that of the selleckchem structure, but it matched 39
out of 65 atoms with itself, vs 19 out of 65 for the selleckchem
structure.
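
RDKit's chiral tags (CHI_TETRAHEDRAL_CW / CHI_TETRAHEDRAL_CCW) are
expressed relative to the order of each atom's bonds, so the same
stereocenter can carry a different tag once the atoms are reordered by
a SMILES round trip. Whether that accounts for the numbers above is
not established here. As a cross-check that does not rely on raw tags,
two molecules can be compared by canonical isomeric SMILES, by CIP
labels, and by a chirality-aware substructure match. The sketch below
uses alanine as a stand-in for rapamycin, and the helper names are
hypothetical.

```python
from rdkit import Chem


def same_stereoisomer(mol_a, mol_b):
    """True if both molecules give the same canonical isomeric SMILES."""
    return (Chem.MolToSmiles(mol_a, isomericSmiles=True)
            == Chem.MolToSmiles(mol_b, isomericSmiles=True))


def chiral_match(mol_a, mol_b):
    """True if each molecule is a chirality-aware substructure of the other."""
    return (mol_a.HasSubstructMatch(mol_b, useChirality=True)
            and mol_b.HasSubstructMatch(mol_a, useChirality=True))


def cip_labels(mol):
    """Sorted CIP labels of the assigned stereocenters."""
    return sorted(label for _, label in Chem.FindMolChiralCenters(mol))


# Stand-in molecules (assumption): L-alanine written with two different
# atom orders, plus its enantiomer D-alanine.
l_ala = Chem.MolFromSmiles('C[C@H](N)C(=O)O')
l_ala_reordered = Chem.MolFromSmiles('N[C@@H](C)C(=O)O')
d_ala = Chem.MolFromSmiles('C[C@@H](N)C(=O)O')

for name, mol in (('L', l_ala), ('L reordered', l_ala_reordered),
                  ('D', d_ala)):
    tags = [str(atom.GetChiralTag()) for atom in mol.GetAtoms()
            if atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED]
    print(name, tags, cip_labels(mol))

print(same_stereoisomer(l_ala, l_ala_reordered),
      chiral_match(l_ala, l_ala_reordered))
print(same_stereoisomer(l_ala, d_ala), chiral_match(l_ala, d_ala))
```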

Sincerely,
Tyler Backman
