Re: [Rdkit-discuss] list of failed chembl ids

2017-08-08 Thread Bennion, Brian
On Aug 8, 2017, at 22:20, Peter S. Shenkin <shen...@gmail.com> wrote: > But I would be curious to see the 51 CHEMBL SMILES that RDKit could not parse. As of ChEMBL 23, the followin

Re: [Rdkit-discuss] list of failed chembl ids

2017-08-08 Thread Andrew Dalke
On Aug 8, 2017, at 22:20, Peter S. Shenkin wrote: > But I would be curious to see the 51 CHEMBL SMILES that RDKit could not parse. As of ChEMBL 23, the following files are available: - the sdf.gz file - pre-computed RDKit Morgan fingerprints in fps.gz format - the

Re: [Rdkit-discuss] list of failed chembl ids

2017-08-08 Thread Peter S. Shenkin
I looked up a bunch of these. The ones I saw are ChEMBL activity records, not molecule records, so they do not contain structural data. But I would be curious to see the 51 CHEMBL SMILES that RDKit could not parse. -P. On Tue, Aug 8, 2017 at 3:00 PM, Bennion, Brian
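For anyone who wants to reproduce a list of SMILES that RDKit cannot parse, here is a minimal sketch. It is not the procedure used by anyone in this thread. The function name `find_unparseable` and the (identifier, SMILES) record layout are assumptions made for illustration. The only RDKit call used is `Chem.MolFromSmiles`, which returns `None` when parsing fails.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_unparseable(
    records: Iterable[Tuple[str, str]],
    parse: Optional[Callable[[str], object]] = None,
) -> List[Tuple[str, str]]:
    """Return the (identifier, SMILES) pairs whose SMILES fail to parse.

    `records` is a hypothetical layout: any iterable of (identifier, SMILES)
    pairs. `parse` must return None for a SMILES it cannot handle.
    """
    if parse is None:
        # Imported lazily so the function can also be exercised with a
        # stand-in parser when RDKit is not installed.
        from rdkit import Chem

        parse = Chem.MolFromSmiles  # returns None on a parse failure
    return [(ident, smiles) for ident, smiles in records if parse(smiles) is None]
```

With RDKit installed, calling `find_unparseable(records)` on pairs read from a ChEMBL export would collect the failures. How the pairs are read depends on the file format and is left out here.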

[Rdkit-discuss] list of failed chembl ids

2017-08-08 Thread Bennion, Brian
Hello, If anyone is interested, the ChEMBL IDs for compounds that had such crazy 2D SD files are listed below. Several are just different formulations of the same parent compound. 181880 450200 1198593 1201364 1977677 1992520 2146259 2146289 2146290 2299271 3182693 3184182 3187332
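The numbers above are bare integers, so looking them up usually means turning them into prefixed identifiers first. Here is a small sketch, assuming they are meant as molecule IDs of the form `CHEMBL<number>`; that reading is an assumption, and the helper name `to_chembl_ids` is hypothetical.

```python
from typing import List


def to_chembl_ids(raw: str) -> List[str]:
    """Convert a whitespace-separated run of numbers into CHEMBL-prefixed IDs.

    Assumes each number is a ChEMBL molecule ID; duplicates are dropped
    while the original order is preserved.
    """
    seen = set()
    ids = []
    for token in raw.split():
        if not token.isdigit():
            raise ValueError(f"not a numeric ID: {token!r}")
        if token not in seen:
            seen.add(token)
            ids.append(f"CHEMBL{token}")
    return ids


# The first three numbers from the message above.
print(to_chembl_ids("181880 450200 1198593"))
```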