Re: [Rdkit-discuss] RDkit and Pubchem
Hi Jason,

Thanks for the info. That's exactly what I want: to download compounds, in any format (SMILES/mol/SDF), for which I only have the Substance ID.

Sundar

On Fri, Dec 1, 2017 at 8:06 PM, Jason Biggs wrote:
> Sundar,
>
> What you do will depend on whether you have an SID or a CID number. Read
> https://pubchemblog.ncbi.nlm.nih.gov/2014/06/19/what-is-the-difference-between-a-substance-and-a-compound-in-pubchem/
> for more info.
>
>> In PubChem terminology, a *substance* is a chemical sample description
>> provided by a single source and a *compound* is a normalized chemical
>> structure representation found in one or more contributed *substances*.
>
> And looking at the pages for a few random substances, it doesn't list the
> same kind of information that you'll find on a compound page. So what you
> need is to get a list of associated compounds for a given substance ID.
>
> https://pubchem.ncbi.nlm.nih.gov/rest/pug/substance/sid/123061/cids/JSON?cids_type=all
>
> Leave off the cids_type=all if you only want one compound. For the SID in
> your query, it doesn't even have a compound, so it returns a message
> stating so.
>
> Jason Biggs

--
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] RDkit and Pubchem
Sundar,

What you do will depend on whether you have an SID or a CID number. Read
https://pubchemblog.ncbi.nlm.nih.gov/2014/06/19/what-is-the-difference-between-a-substance-and-a-compound-in-pubchem/
for more info:

> In PubChem terminology, a *substance* is a chemical sample description
> provided by a single source and a *compound* is a normalized chemical
> structure representation found in one or more contributed *substances*.

Looking at the pages for a few random substances, they don't list the same kind of information that you'll find on a compound page. So what you need is the list of compounds associated with a given substance ID:

https://pubchem.ncbi.nlm.nih.gov/rest/pug/substance/sid/123061/cids/JSON?cids_type=all

Leave off the cids_type=all if you only want one compound. The SID in your query doesn't even have a compound, so the request returns a message stating so.

Jason Biggs

On Fri, Dec 1, 2017 at 5:33 PM, Sundar wrote:
> Hi Jason,
>
> This is great. I would really benefit from this.
> At present I am looking for a way to download smiles or mol data of a few
> compound which only have SIDs and CIDs.
> Can we do it? I failed after trying the following,
>
> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/sid/144205334/property/CanonicalSMILES,IsomericSMILES,InChI/JSON
>
> Thanks,
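The SID-to-CID lookup described in this message could be scripted roughly as below. This is only a sketch under assumptions: the function names are made up for illustration, and the `InformationList` / `Information` / `CID` layout of the JSON reply is assumed from what the endpoint has returned, so check it against the PUG REST documentation before relying on it.

```python
import json
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def sid_to_cids_url(sid, all_cids=True):
    """Build the substance -> compound lookup URL quoted in the thread."""
    url = f"{PUG_REST}/substance/sid/{int(sid)}/cids/JSON"
    if all_cids:
        # As noted in the thread, drop this if one compound is enough.
        url += "?cids_type=all"
    return url


def parse_cids(payload):
    """Collect CIDs from the JSON text of a reply.

    Assumed layout: {"InformationList": {"Information": [{"SID": ..., "CID": [...]}]}}.
    A substance with no associated compound simply yields an empty list.
    """
    data = json.loads(payload)
    cids = []
    for info in data.get("InformationList", {}).get("Information", []):
        cids.extend(info.get("CID", []))
    return cids


def fetch(url, timeout=30):
    """Download a URL as text; HTTP errors propagate as urllib.error.HTTPError."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Something like `parse_cids(fetch(sid_to_cids_url(123061)))` would then give the CIDs to feed into the compound queries from earlier in the thread.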
Re: [Rdkit-discuss] RDkit and Pubchem
Hi Jason,

This is great. I would really benefit from this.
At present I am looking for a way to download SMILES or mol data for a few compounds for which I only have SIDs and CIDs.
Can we do it? I tried the following, but it failed:

https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/sid/144205334/property/CanonicalSMILES,IsomericSMILES,InChI/JSON

Thanks,

On Fri, Dec 1, 2017 at 1:11 PM, Jason Biggs wrote:
> Pubchem has an easy to use rest API, described here:
> https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest
>
> If you have a compound ID, you can query properties via something
>
> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/property/CanonicalSMILES,IsomericSMILES,InChI/JSON
>
> It comes back in JSON format, but you can have it return XML or plain text.
>
> If you want an SDF file, something like
>
> https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/SDF?record_type=3d
>
> setting up a python function to query this shouldn't be difficult.
>
> Jason Biggs
Re: [Rdkit-discuss] RDkit and Pubchem
Hi Jubi,

If you need the entire dataset and are not creating queries via the API, you can download all PubChem data via FTP here: ftp://ftp.ncbi.nlm.nih.gov/pubchem/

Then download the SDFs and extract the SMILES (I've used regular expressions that match the appropriate data tag with good success).

Vin
University of Alabama

From: Jason Biggs [mailto:jasondbi...@gmail.com]
Sent: Friday, December 1, 2017 1:12 PM
To: Sundar
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] RDkit and Pubchem

Pubchem has an easy to use rest API, described here: https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest

If you have a compound ID, you can query properties via something

https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/property/CanonicalSMILES,IsomericSMILES,InChI/JSON

It comes back in JSON format, but you can have it return XML or plain text.

If you want an SDF file, something like

https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/SDF?record_type=3d

setting up a python function to query this shouldn't be difficult.

Jason Biggs
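The regular-expression approach mentioned in this message might look something like the sketch below. It is not Vin's actual code: the helper name is hypothetical, and the tag names in the usage note (`PUBCHEM_OPENEYE_CAN_SMILES`, `PUBCHEM_COMPOUND_CID`) are assumed from PubChem compound SDFs of that period, so inspect a downloaded file to confirm which tags it really carries.

```python
import re


def extract_tag_values(sdf_text, tag):
    """Return the first value line of every `> <tag>` data item in SDF text.

    An SDF data item is a header line such as `> <TAG>` followed by the
    value on the next line and then a blank line. Only single-line values
    are handled here, which is enough for SMILES strings and IDs.
    """
    pattern = re.compile(
        r"^>[ \t]*<" + re.escape(tag) + r">[^\n]*\n([^\n]+)",
        re.MULTILINE,
    )
    # strip() also removes a trailing carriage return from CRLF files.
    return [value.strip() for value in pattern.findall(sdf_text)]
```

Calling `extract_tag_values(text, "PUBCHEM_OPENEYE_CAN_SMILES")` on the contents of an uncompressed SDF would return one SMILES string per record that carries that tag. For very large files it would be more economical to read record by record rather than load everything into memory.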
Re: [Rdkit-discuss] RDkit and Pubchem
PubChem has an easy-to-use REST API, described here: https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest

If you have a compound ID, you can query properties via something like

https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/property/CanonicalSMILES,IsomericSMILES,InChI/JSON

It comes back in JSON format, but you can have it return XML or plain text.

If you want an SDF file, something like

https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/SDF?record_type=3d

Setting up a Python function to query this shouldn't be difficult.

Jason Biggs

On Fri, Dec 1, 2017 at 12:51 PM, Sundar wrote:
> I would like to download at least SMILES (great if I can also download mol
> files).
> And the same is true for Pubchem Compound ID or using Substance ID.
> Or even download the whole data set using an assay id. Anything could help.
>
> Thanks,
> Jubi
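A minimal sketch of the kind of Python function this message has in mind, using only the standard library. The function names are invented for illustration, the property names are the ones used in the URLs above, and the `PropertyTable` / `Properties` layout of the JSON reply is an assumption to verify against the PUG REST documentation.

```python
import json
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(cid, properties=("CanonicalSMILES", "IsomericSMILES", "InChI")):
    """Build the property query URL for one compound ID."""
    return f"{PUG_REST}/compound/cid/{int(cid)}/property/{','.join(properties)}/JSON"


def sdf_url(cid, record_type="3d"):
    """Build the SDF download URL for one compound ID."""
    return f"{PUG_REST}/compound/cid/{int(cid)}/SDF?record_type={record_type}"


def parse_properties(payload):
    """Return the list of per-compound property dicts from the JSON text.

    Assumed layout: {"PropertyTable": {"Properties": [{"CID": ..., ...}]}}.
    """
    return json.loads(payload)["PropertyTable"]["Properties"]


def fetch(url, timeout=30):
    """Download a URL as text; HTTP errors propagate as urllib.error.HTTPError."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With network access, `parse_properties(fetch(property_url(2244)))` should give a one-element list whose SMILES entry can be handed to RDKit's `Chem.MolFromSmiles`. Keeping URL building and parsing separate from the download makes the pieces easy to check without hitting the server.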
Re: [Rdkit-discuss] RDkit and Pubchem
Hi,

If you would like to get compounds from ChEMBL instead of PubChem, you can use this Python client:

https://github.com/chembl/chembl_webresource_client

and get access to 1.7M+ unique compounds as molfiles, SMILES, InChIs, InChI keys and images.

Cheers,
Michał

On Fri, Dec 1, 2017 at 6:51 PM, Sundar wrote:
> I would like to download at least SMILES (great if I can also download mol
> files).
> And the same is true for Pubchem Compound ID or using Substance ID.
> Or even download the whole data set using an assay id. Anything could help.
>
> Thanks,
> Jubi
Re: [Rdkit-discuss] RDkit and Pubchem
I would like to download at least SMILES (great if I can also download mol files), given either a PubChem Compound ID or a Substance ID. Or even download the whole data set using an assay ID. Anything could help.

Thanks,
Jubi

On Fri, Dec 1, 2017 at 11:55 AM, Tim Dudgeon wrote:
> In what way? Given a single PubChem compound or substance ID you just want
> to pull the smiles or molfile into RDKit?
>
> Tim
Re: [Rdkit-discuss] RDkit and Pubchem
On 12/01/2017 11:55 AM, Tim Dudgeon wrote:
> In what way? Given a single PubChem compound or substance ID you just
> want to pull the smiles or molfile into RDKit?

Furthermore, what's your definition of "a compound"? If it includes stereochemistry, PubChem usually has 3D mol files, except where it doesn't.

--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
Re: [Rdkit-discuss] RDkit and Pubchem
In what way? Given a single PubChem compound or substance ID, you just want to pull the SMILES or molfile into RDKit?

Tim

On 01/12/17 17:26, Sundar wrote:
> Hi RDkit users,
>
> I was wondering if RDkit has a means of downloading compounds from Pubchem.
> Also let me know of any other ways that would help here.
>
> Thanks,
> Jubi