Re: [Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7 million compounds

2017-08-07 Thread Bennion, Brian
Hello Peter,
Great, that just made me realize that I was not using the most recent RDKit 
version in my conda environment.
I reread the 2D SDF file with the latest RDKit version, and now only 31 
molecules are tossed out by the SDMolSupplier in RDKit. 51 compounds had 
errors when reading in the SMILES strings.
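
For anyone who wants to reproduce this kind of count, here is a minimal sketch (not the script I actually ran). The helper names tally_parse_results, tally_sdf and tally_smiles are made up for illustration. The only RDKit behaviour it assumes is that iterating a Chem.SDMolSupplier yields None for records it cannot parse and that Chem.MolFromSmiles returns None on failure.

```python
def tally_parse_results(records, parse):
    """Return (number parsed OK, list of 0-based indices that failed).

    `parse` is any callable that returns None (or raises) for a bad record.
    """
    n_ok = 0
    failed = []
    for index, record in enumerate(records):
        try:
            mol = parse(record)
        except Exception:
            # Treat a raised error the same as a None result.
            mol = None
        if mol is None:
            failed.append(index)
        else:
            n_ok += 1
    return n_ok, failed


def tally_sdf(path):
    """Count SD file records that RDKit can and cannot read."""
    from rdkit import Chem  # imported lazily so the helper above has no RDKit dependency

    supplier = Chem.SDMolSupplier(path)
    # The supplier already yields None for unreadable records.
    return tally_parse_results(supplier, lambda mol: mol)


def tally_smiles(smiles_list):
    """Count SMILES strings that RDKit can and cannot parse."""
    from rdkit import Chem

    return tally_parse_results(smiles_list, Chem.MolFromSmiles)
```

The failed indices can then be used to pull out the offending records and compare them between RDKit versions.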
Brian


From: Peter S. Shenkin [mailto:shen...@gmail.com]
Sent: Monday, August 07, 2017 14:26
To: Bennion, Brian
Cc: Chris Swain; rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7 million 
compounds

That molecule's SMILES is correctly rendered by RDKit, or at least by the 
version of RDKit behind Slack:

[Inline image 1]


-P.

On Mon, Aug 7, 2017 at 3:54 PM, Bennion, Brian <benni...@llnl.gov> wrote:

The carbocations are in small heterocyclic molecules. See CHEMBL3815233.

Brian




From: Chris Swain <sw...@mac.com>
Sent: Monday, August 7, 2017 11:46:30 AM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7 million 
compounds

I've not tried to read in ChEMBL but I have tried to process other large 
datasets e.g. ZINC. My impression was that problems arose with small 
heterocyclic systems, particularly if fused or containing multiple different 
heteroatoms. I did wonder if the different aromaticity models might be the 
issue.

Chris
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7 million compounds

2017-08-07 Thread Peter S. Shenkin
That molecule's SMILES is correctly rendered by RDKit, or at least by the
version of RDKit behind Slack:

[image: Inline image 1]


-P.

On Mon, Aug 7, 2017 at 3:54 PM, Bennion, Brian wrote:

> The carbocations are in small heterocyclic molecules. See CHEMBL3815233.
>
> Brian
>
>
> --
> *From:* Chris Swain 
> *Sent:* Monday, August 7, 2017 11:46:30 AM
> *To:* rdkit-discuss@lists.sourceforge.net
> *Subject:* [Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7
> million compounds
>
> I've not tried to read in ChEMBL but I have tried to process other large
> datasets e.g. ZINC. My impression was that problems arose with small
> heterocyclic systems, particularly if fused or containing multiple
> different heteroatoms. I did wonder if the different aromaticity models
> might be the issue.
>
> Chris


Re: [Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7 million compounds

2017-08-07 Thread Bennion, Brian
The carbocations are in small heterocyclic molecules. See CHEMBL3815233.

Brian



From: Chris Swain 
Sent: Monday, August 7, 2017 11:46:30 AM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7 million 
compounds

I've not tried to read in ChEMBL but I have tried to process other large 
datasets e.g. ZINC. My impression was that problems arose with small 
heterocyclic systems, particularly if fused or containing multiple different 
heteroatoms. I did wonder if the different aromaticity models might be the 
issue.

Chris


[Rdkit-discuss] . Re: using rdkit to read in chembl23 1.7 million compounds

2017-08-07 Thread Chris Swain
I've not tried to read in ChEMBL but I have tried to process other large 
datasets e.g. ZINC. My impression was that problems arose with small 
heterocyclic systems, particularly if fused or containing multiple different 
heteroatoms. I did wonder if the different aromaticity models might be the 
issue.
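
One rough way to probe the aromaticity-model idea is to perceive aromaticity on the same input under several of RDKit's models and compare the results. This is only a sketch under assumptions: the helper name aromatic_atom_counts is made up, it needs an RDKit version that provides Chem.AromaticityModel with the RDKIT, MDL and SIMPLE members, and it says nothing about which model ZINC or ChEMBL used when writing their files.

```python
def aromatic_atom_counts(smiles):
    """Return {model name: number of aromatic atoms, or None on failure}."""
    from rdkit import Chem  # lazy import: defining the function does not need RDKit

    models = {
        "RDKit": Chem.AromaticityModel.AROMATICITY_RDKIT,
        "MDL": Chem.AromaticityModel.AROMATICITY_MDL,
        "Simple": Chem.AromaticityModel.AROMATICITY_SIMPLE,
    }
    # Run every sanitization step except aromaticity perception,
    # so that the model can be chosen explicitly afterwards.
    ops = (Chem.SanitizeFlags.SANITIZE_ALL
           ^ Chem.SanitizeFlags.SANITIZE_SETAROMATICITY)

    counts = {}
    for name, model in models.items():
        mol = Chem.MolFromSmiles(smiles, sanitize=False)
        if mol is None:
            counts[name] = None  # SMILES syntax could not be parsed at all
            continue
        try:
            Chem.SanitizeMol(mol, sanitizeOps=ops)
            Chem.SetAromaticity(mol, model)
        except Exception:
            counts[name] = None  # e.g. valence or kekulization failure
            continue
        counts[name] = sum(1 for atom in mol.GetAtoms() if atom.GetIsAromatic())
    return counts
```

Structures where the counts differ between models, or where every model returns None, would be the ones to look at more closely.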

Chris