[Rdkit-discuss] reducing space in image rendered using Draw.MolsToGridImage
How can I reduce the space between rows of images when rendering with Draw.MolsToGridImage? The default spacing is unnecessarily large.

Thanks,
Sundar
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
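One knob that may help, offered as a sketch rather than an answer confirmed on the list: Draw.MolsToGridImage accepts a subImgSize=(width, height) argument, and to my understanding the canvas is a plain grid of cells of that size, so a shorter cell height leaves less blank space between rows. The helper grid_canvas_size below is hypothetical (it is not part of RDKit) and only reproduces that layout arithmetic; the corresponding RDKit call is shown as a comment.

```python
import math


def grid_canvas_size(n_mols, mols_per_row=3, sub_img_size=(200, 200)):
    """Hypothetical helper: pixel size of a grid of equally sized cells.

    Assumes one cell per molecule and ceil(n_mols / mols_per_row) rows.
    """
    n_rows = math.ceil(n_mols / mols_per_row)
    cell_w, cell_h = sub_img_size
    return mols_per_row * cell_w, n_rows * cell_h


# Square cells: 7 molecules, 3 per row, so 3 rows of 200 px each.
print(grid_canvas_size(7, 3, (200, 200)))  # -> (600, 600)
# Shorter cells: same number of rows, but each row is only 120 px tall.
print(grid_canvas_size(7, 3, (200, 120)))  # -> (600, 360)

# With RDKit installed, the matching call would look like:
#   from rdkit.Chem import Draw
#   img = Draw.MolsToGridImage(mols, molsPerRow=3, subImgSize=(200, 120))
```

If the molecules end up too small, widening the cells while keeping them short is the obvious thing to try next.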
[Rdkit-discuss] Diversity picker
Hi RDKit users,

Is it possible to use RDKit to pick a diverse subset from a larger set of compounds based on a Tanimoto cutoff, i.e. so that the picked compounds have a Tanimoto coefficient below a given value (e.g. 0.7) to one another? The current diversity picker selects a diverse set based on a requested number of compounds rather than on a Tanimoto score.

Thanks,
Jubilant
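A threshold-based pick of this kind can be sketched as a simple greedy loop: walk through the compounds and keep one only if it is less similar than the cutoff to everything kept so far. The sketch below is an illustration, not an RDKit feature. It assumes fingerprints are given as plain Python sets of on-bit indices, and the names tanimoto and pick_below_threshold are made up for this example.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def pick_below_threshold(fingerprints, max_similarity=0.7):
    """Greedy pick: keep a compound only if its similarity to every compound
    already kept is below max_similarity. Returns indices into fingerprints."""
    picked = []
    for i, fp in enumerate(fingerprints):
        if all(tanimoto(fp, fingerprints[j]) < max_similarity for j in picked):
            picked.append(i)
    return picked


# Toy fingerprints (made-up bit sets, not real molecules).
fps = [
    {1, 2, 3, 4},     # picked first
    {1, 2, 3, 4, 5},  # similarity 4/5 = 0.8 to the first, so rejected
    {1, 2, 9, 10},    # similarity 2/6 to the first, so picked
    {20, 21, 22},     # similarity 0 to everything picked, so picked
]
print(pick_below_threshold(fps, max_similarity=0.7))  # -> [0, 2, 3]
```

The result depends on the input order, and the number of picked compounds is not known in advance. With RDKit bit-vector fingerprints, DataStructs.TanimotoSimilarity could take the place of the toy tanimoto function.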
Re: [Rdkit-discuss] Sanitizing SD file
Thanks, everyone, for your valuable input.

Chem.SDMolSupplier('lig.sdf', sanitize=False) works for my compounds for now. I am not sure whether it will cause problems when I calculate descriptors or in later calculations. I will also try turning off the strict property checking.

Thanks,
Sundar

On Thu, Dec 14, 2017 at 1:14 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> On Thu, Dec 14, 2017 at 6:35 AM, Francois BERENGER <beren...@bioreg.kyushu-u.ac.jp> wrote:
>> On 12/14/2017 02:10 PM, Greg Landrum wrote:
>>> On Thu, Dec 14, 2017 at 4:22 AM, Francois BERENGER <beren...@bioreg.kyushu-u.ac.jp> wrote:
>>>> On 12/14/2017 05:15 AM, Sundar wrote:
>>>>> Hi RDKit users,
>>>>>
>>>>> I encountered the following sanitization issue while trying to load an SD file using
>>>>> Chem.SDMolSupplier('lig.sdf')
>>>>>
>>>>> Explicit valence for atom # 16 N, 4, is greater than permitted
>>>>> ERROR: Could not sanitize molecule ending on line 3145
>>>>
>>>> I also encounter this exact error sometimes.
>>>>
>>>> Is there a way to tell rdkit to automatically correct this atom type?
>>>
>>> The code currently only automatically corrects cases where it's really, really obvious what the correction should be, like C-N(=O)=O -> C-[N+](=O)[O-].
>>
>> Where is this in the code?
>> I might have a look one day.
>
> It's here:
> https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/MolOps.cpp#L194
>
>>> The philosophy taken in the RDKit is that it's better to have a bad structure be rejected than it is to try and learn from it.
>>> If you disagree with this, it is pretty easy to switch off the sanitization checks and keep the bad molecules.
>>
>> I understand. I also guess unsanitized molecules would make some things crash, just later.
>
> That depends. You can turn off the strict property checking:
>
> In [2]: m = Chem.MolFromSmiles('C1CCN1(C)C')
> [08:09:23] Explicit valence for atom # 3 N, 4, is greater than permitted
>
> In [3]: m = Chem.MolFromSmiles('C1CCN1(C)C',sanitize=False)
>
> In [6]: m.UpdatePropertyCache(strict=False)
>
> In [7]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_PROPERTIES)
> Out[7]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [8]: Chem.MolToSmiles(m)
> Out[8]: 'CN1(C)CCC1'
>
> or if you want to be more aggressive you can also turn off the cleanup that "fixes" those odd structures:
>
> In [9]: m = Chem.MolFromSmiles('CCCN(=O)=O',sanitize=False)
>
> In [10]: m.UpdatePropertyCache(strict=False)
>
> In [11]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_PROPERTIES^Chem.SANITIZE_CLEANUP)
> Out[11]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [12]: Chem.MolToSmiles(m)
> Out[12]: 'CCCN(=O)=O'
>
> In either case, many standard molecular operations should still work, you'll just be operating on molecules with atoms in unreasonable valence states.
>
> -greg
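For readers unfamiliar with the SANITIZE_ALL^SANITIZE_PROPERTIES idiom used above, the following toy sketch shows how XOR-ing a flag out of an "all steps" mask switches off just that step. The Step enum and its values are made up for illustration and are not RDKit's real SanitizeFlags values; only the bit arithmetic carries over.

```python
from enum import IntFlag


class Step(IntFlag):
    """Toy stand-in for a set of sanitization steps (NOT RDKit's real values)."""
    CLEANUP = 1
    PROPERTIES = 2
    KEKULIZE = 4
    ALL = CLEANUP | PROPERTIES | KEKULIZE


# XOR with a flag that is set in ALL clears just that flag.
without_properties = Step.ALL ^ Step.PROPERTIES
# Chaining XORs clears several flags, as in the second example above.
without_both = Step.ALL ^ Step.PROPERTIES ^ Step.CLEANUP

print(Step.PROPERTIES in without_properties)  # -> False
print(Step.KEKULIZE in without_properties)    # -> True
print(int(without_both))                      # -> 4
```

Note that XOR toggles: applied to a mask that does not already contain the flag, it would switch the step on rather than off. Starting from an "all" mask, as in the quoted session, avoids that; `mask & ~flag` is the more defensive spelling.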
Re: [Rdkit-discuss] RDkit and Pubchem
Hi Jason,

Thanks for the info. That's exactly what I want: to download a compound in any format (SMILES/mol/SDF) when I only have its Substance ID.

Sundar

On Fri, Dec 1, 2017 at 8:06 PM, Jason Biggs <jasondbi...@gmail.com> wrote:
> Sundar,
>
> What you do will depend on whether you have an SID or a CID number. Read
> https://pubchemblog.ncbi.nlm.nih.gov/2014/06/19/what-is-the-difference-between-a-substance-and-a-compound-in-pubchem/
> for more info.
>
>     In PubChem terminology, a *substance* is a chemical sample description provided by a single source and a *compound* is a normalized chemical structure representation found in one or more contributed *substances*.
>
> And looking at the pages for a few random substances, it doesn't list the same kind of information that you'll find on a compound page. So what you need is to get a list of associated compounds for a given substance ID.
>
> https://pubchem.ncbi.nlm.nih.gov/rest/pug/substance/sid/123061/cids/JSON?cids_type=all
>
> Leave off the cids_type=all if you only want one compound. For the SID in your query, it doesn't even have a compound, so it returns a message stating so.
>
> Jason Biggs
>
> On Fri, Dec 1, 2017 at 5:33 PM, Sundar <jubilantsun...@gmail.com> wrote:
>> Hi Jason,
>>
>> This is great. I would really benefit from this.
>> At present I am looking for a way to download SMILES or mol data for a few compounds for which I only have SIDs and CIDs.
>> Can we do that? The following attempt failed:
>>
>> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/sid/144205334/property/CanonicalSMILES,IsomericSMILES,InChI/JSON
>>
>> Thanks,
>>
>> On Fri, Dec 1, 2017 at 1:11 PM, Jason Biggs <jasondbi...@gmail.com> wrote:
>>> Pubchem has an easy to use rest API, described here:
>>> https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest
>>>
>>> If you have a compound ID, you can query properties via something like
>>>
>>> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/property/CanonicalSMILES,IsomericSMILES,InChI/JSON
>>>
>>> It comes back in JSON format, but you can have it return XML or plain text.
>>>
>>> If you want an SDF file, something like
>>>
>>> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/SDF?record_type=3d
>>>
>>> setting up a python function to query this shouldn't be difficult.
>>>
>>> Jason Biggs
>>>
>>> On Fri, Dec 1, 2017 at 12:51 PM, Sundar <jubilantsun...@gmail.com> wrote:
>>>> I would like to download at least SMILES (great if I can also download mol files).
>>>> And the same is true for Pubchem Compound ID or using Substance ID.
>>>> Or even download the whole data set using an assay id. Anything could help.
>>>>
>>>> Thanks,
>>>> Jubi
>>>>
>>>> On Fri, Dec 1, 2017 at 11:55 AM, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>>>>> In what way? Given a single PubChem compound or substance ID you just want to pull the smiles or molfile into RDKit?
>>>>>
>>>>> Tim
>>>>>
>>>>> On 01/12/17 17:26, Sundar wrote:
>>>>>> Hi RDKit users,
>>>>>>
>>>>>> I was wondering whether RDKit has a way to download compounds from PubChem.
>>>>>> Also, please let me know of any other approaches that could help here.
>>>>>>
>>>>>> Thanks,
>>>>>> Jubi
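The substance-to-compound lookup Jason describes can be wrapped in a few lines of standard-library Python. This is a minimal sketch: the function names sid_to_cids_url and fetch_json are invented for the example, and only the URL pattern comes from the thread. The layout of the JSON that comes back is not shown in the thread, so the sketch stops at decoding it.

```python
import json
import urllib.request

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def sid_to_cids_url(sid, all_cids=False):
    """Build the PUG REST URL listing the compound IDs linked to a substance ID."""
    url = f"{PUG_REST}/substance/sid/{int(sid)}/cids/JSON"
    if all_cids:
        # As noted in the thread, leave this off to get only one compound.
        url += "?cids_type=all"
    return url


def fetch_json(url, timeout=30):
    """Download a URL and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Reproduces the example URL from the thread; no network access needed here.
print(sid_to_cids_url(123061, all_cids=True))
```

Calling fetch_json(sid_to_cids_url(sid)) would perform the actual request. urlopen raises urllib.error.HTTPError on an error status, so a caller may want to catch that for substances PubChem cannot resolve.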
Re: [Rdkit-discuss] RDkit and Pubchem
Hi Jason,

This is great. I would really benefit from this.
At present I am looking for a way to download SMILES or mol data for a few compounds for which I only have SIDs and CIDs.
Can we do that? The following attempt failed:

https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/sid/144205334/property/CanonicalSMILES,IsomericSMILES,InChI/JSON

Thanks,

On Fri, Dec 1, 2017 at 1:11 PM, Jason Biggs <jasondbi...@gmail.com> wrote:
> Pubchem has an easy to use rest API, described here:
> https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest
>
> If you have a compound ID, you can query properties via something like
>
> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/property/CanonicalSMILES,IsomericSMILES,InChI/JSON
>
> It comes back in JSON format, but you can have it return XML or plain text.
>
> If you want an SDF file, something like
>
> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/SDF?record_type=3d
>
> setting up a python function to query this shouldn't be difficult.
>
> Jason Biggs
>
> On Fri, Dec 1, 2017 at 12:51 PM, Sundar <jubilantsun...@gmail.com> wrote:
>> I would like to download at least SMILES (great if I can also download mol files).
>> And the same is true for Pubchem Compound ID or using Substance ID.
>> Or even download the whole data set using an assay id. Anything could help.
>>
>> Thanks,
>> Jubi
>>
>> On Fri, Dec 1, 2017 at 11:55 AM, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>>> In what way? Given a single PubChem compound or substance ID you just want to pull the smiles or molfile into RDKit?
>>>
>>> Tim
>>>
>>> On 01/12/17 17:26, Sundar wrote:
>>>> Hi RDKit users,
>>>>
>>>> I was wondering whether RDKit has a way to download compounds from PubChem.
>>>> Also, please let me know of any other approaches that could help here.
>>>>
>>>> Thanks,
>>>> Jubi
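As Jason suggests, the compound-ID queries are easy to generate from Python. The sketch below only builds the two URL shapes quoted in this message. The function names compound_property_url and compound_sdf_url are hypothetical, and the default property list is simply the one from the example URL.

```python
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(cid,
                          properties=("CanonicalSMILES", "IsomericSMILES", "InChI"),
                          fmt="JSON"):
    """Build a PUG REST URL returning the given properties for a compound ID."""
    return f"{PUG_REST}/compound/cid/{int(cid)}/property/{','.join(properties)}/{fmt}"


def compound_sdf_url(cid, record_type=None):
    """Build a PUG REST URL returning an SDF record for a compound ID."""
    url = f"{PUG_REST}/compound/cid/{int(cid)}/SDF"
    if record_type:
        # e.g. record_type="3d", as in the example from the thread
        url += f"?record_type={record_type}"
    return url


# Both lines reproduce the example URLs for CID 2244 quoted above.
print(compound_property_url(2244))
print(compound_sdf_url(2244, record_type="3d"))
```

Passing fmt="XML" switches the property query to XML output, as mentioned in the thread. The URLs can be opened with urllib.request.urlopen or any HTTP client, and a downloaded SDF saved to disk can then be read with Chem.SDMolSupplier.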
Re: [Rdkit-discuss] RDkit and Pubchem
I would like to download at least the SMILES (it would be great if I could also download mol files), given either a PubChem Compound ID or a Substance ID. Downloading a whole data set by assay ID would work too. Anything would help.

Thanks,
Jubi

On Fri, Dec 1, 2017 at 11:55 AM, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
> In what way? Given a single PubChem compound or substance ID you just want to pull the smiles or molfile into RDKit?
>
> Tim
>
> On 01/12/17 17:26, Sundar wrote:
>> Hi RDKit users,
>>
>> I was wondering whether RDKit has a way to download compounds from PubChem.
>> Also, please let me know of any other approaches that could help here.
>>
>> Thanks,
>> Jubi
[Rdkit-discuss] RDkit and Pubchem
Hi RDKit users,

I was wondering whether RDKit has a way to download compounds from PubChem.
Also, please let me know of any other approaches that could help here.

Thanks,
Jubi
[Rdkit-discuss] couldn't import Chem from rdkit
Hi rdkit-discuss,

I am new to RDKit. I installed it using the following commands:

conda install -c rdkit rdkit
conda install -c rdkit/label/attic rdkit
conda install -c rdkit/label/beta rdkit
conda install -c rdkit/label/testing rdkit

When I tried to use it, I got the following error:

In [1]: from __future__ import print_function

In [2]: from rdkit import Chem
---
ImportError                               Traceback (most recent call last)
 in ()
----> 1 from rdkit import Chem

/Scr/scr-test-sundar/Scr-sundar/sundar/Programs/anaconda/anaconda2017/lib/python2.7/site-packages/rdkit/Chem/__init__.py in ()
     16
     17 """
---> 18 from rdkit import rdBase
     19 from rdkit import RDConfig
     20

ImportError: libboost_python.so.1.56.0: cannot open shared object file: No such file or directory

Can someone help me solve this?

Thanks,
Jubi