Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread Greg Landrum
Ivan's solution, using the SMARTS extension [r{12-}], is what I would use
for this.
I would suggest using the permanent documentation link though:
http://rdkit.org/docs/RDKit_Book.html#smarts-support-and-extensions
That's the one that I keep up to date.
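
A minimal sketch of how the [r{12-}] query could be applied to a molecule is shown below. The helper name is_macrocycle and the two test molecules are illustrative assumptions, not something RDKit provides; the only RDKit calls used are Chem.MolFromSmarts, Chem.MolFromSmiles and HasSubstructMatch.

```python
from rdkit import Chem

# RDKit SMARTS extension: an atom whose smallest ring has 12 or more members.
MACROCYCLE_QUERY = Chem.MolFromSmarts("[r{12-}]")


def is_macrocycle(mol):
    """Return True if at least one atom of mol matches [r{12-}]."""
    return mol.HasSubstructMatch(MACROCYCLE_QUERY)


# Illustrative inputs: a 12-membered carbocycle and benzene.
cyclododecane = Chem.MolFromSmiles("C1" + "C" * 11 + "1")
benzene = Chem.MolFromSmiles("c1ccccc1")

print(is_macrocycle(cyclododecane), is_macrocycle(benzene))
```

Compiling the query once and reusing it keeps the SMARTS parsing cost out of the per-molecule loop when screening a large library.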

A note to Dave's point about anthracene: because the "r" query is
defined in terms of smallest rings, things like anthracene aren't a problem
when using [r{9-}].
>>> from rdkit import Chem
>>> Chem.MolFromSmiles('C1=CC2=CC3=CC=CC=C3C=C2C=C1').HasSubstructMatch(Chem.MolFromSmarts('[r{9-}]'))
False

The next version of the documentation includes a SMARTS reference, so it'll
be easier to look stuff like this up:
https://github.com/rdkit/rdkit/blob/master/Docs/Book/RDKit_Book.rst#smarts-reference



On Wed, Oct 9, 2019 at 6:46 PM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> Hi David,
>
> Thanks for the tip! I just found it in the documentation; the syntax is
> [r{12-20}]. See
> http://rdkit.org/docs_temp/RDKit_Book.html#smarts-support-and-extensions
>
> Note that this doesn't suffer from the hard-coded limitation I mentioned,
> and you can even specify open ranges such as [r{12-}].
>
> Ivan
>
>
> On Wed, Oct 9, 2019 at 12:35 PM David Cosgrove wrote:
>
>> Hi Ivan,
>> There is an RDKit extension to SMARTS that allows something like
>> [r12-20]. I can’t check the exact syntax at the moment. You might want to
>> check that atoms are not in smaller rings as well, so as not to pull up
>> things like anthracene which might not be something you’d want to class as
>> a macrocycle.
>> Cheers,
>> Dave
>>
>> On Wed, 9 Oct 2019 at 14:39, Ivan Tubert-Brohman <
>> ivan.tubert-broh...@schrodinger.com> wrote:
>>
>>> Hi Thomas,
>>>
>>> I don't know of an RDKit function that directly recognizes macrocycles,
>>> but you could find the size of the largest ring this way:
>>>
>>> ri = mol.GetRingInfo()
>>> largest_ring_size = max((len(r) for r in ri.AtomRings()), default=0)
>>> if largest_ring_size >= 12:  # "at least twelve" atoms, per the definition quoted below
>>>     ...
>>>
>>> You can also find if a molecule has a ring of a certain size using
>>> SMARTS, but only for rings up to size 20 at the moment (this is an
>>> RDKit-specific limit). For example, if you are happy with finding rings of
>>> size 12-20, you could use SMARTS [r12,r13,r14,r15,r16,r17,r18,r19,r20].
>>> It's ugly but can be handy if you already have SMARTS-based tools to reuse.
>>>
>>> Ivan
>>>
>>> On Wed, Oct 9, 2019 at 7:25 AM Thomas Evangelidis wrote:
>>>
>>>> Greetings,
>>>>
>>>> Is there an automatic way to distinguish the macrocyclic molecules
>>>> within a large chemical library using RDKit? For example, according to this
>>>> definition: Macrocycles are ring structures composed of at least twelve
>>>> atoms in the central cyclic framework [1,2,3]. Maybe someone here has a
>>>> better definition. Could anyone give me some hints on how to program this?
>>>>
>>>> I thank you in advance.
>>>> Thomas
>>>>
>>>> 1. Yudin AK (2015) Macrocycles: lessons from the distant past, recent
>>>> developments, and future directions. Chem Sci 6:30–49.
>>>> 2. Marsault E, Peterson ML (2011) Macrocycles are great cycles:
>>>> applications, opportunities, and challenges of synthetic macrocycles in
>>>> drug discovery. J Med Chem 54:1961–2004.
>>>> 3. Heinis C (2014) Drug discovery: tools and rules for macrocycles. Nat
>>>> Chem Biol 10:696–698.
>>>>
>>>> --
>>>>
>>>> Dr. Thomas Evangelidis
>>>>
>>>> Research Scientist
>>>>
>>>> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
>>>> Academy of Sciences, Prague, Czech Republic
>>>>   &
>>>> CEITEC - Central European Institute of Technology, Brno, Czech Republic
>>>>
>>>> email: teva...@gmail.com, Twitter: tevangelidis, LinkedIn: Thomas Evangelidis
>>>>
>>>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>>>
>>>> ___
>>>> Rdkit-discuss mailing list
>>>> Rdkit-discuss@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

>> --
>> David Cosgrove
>> Freelance computational chemistry and chemoinformatics developer
>> http://cozchemix.co.uk
>>


Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread Ivan Tubert-Brohman
Hi David,

Thanks for the tip! I just found it in the documentation; the syntax is
[r{12-20}]. See
http://rdkit.org/docs_temp/RDKit_Book.html#smarts-support-and-extensions

Note that this doesn't suffer from the hard-coded limitation I mentioned,
and you can even specify open ranges such as [r{12-}].

Ivan
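
The difference between the closed range [r{12-20}] and the open range [r{12-}] can be sketched as follows. The 24-membered carbocycle is a hypothetical test molecule chosen only because its ring is larger than 20 atoms.

```python
from rdkit import Chem

closed_range = Chem.MolFromSmarts("[r{12-20}]")  # smallest ring size from 12 to 20
open_range = Chem.MolFromSmarts("[r{12-}]")      # smallest ring size 12 or more

# Hypothetical test molecule: a plain 24-membered carbocycle.
ring24 = Chem.MolFromSmiles("C1" + "C" * 23 + "1")

in_closed = ring24.HasSubstructMatch(closed_range)
in_open = ring24.HasSubstructMatch(open_range)
print(in_closed, in_open)
```

The closed range is expected to miss this ring, while the open range should match it, so the open form is the safer choice when no upper bound on ring size is intended.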




Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread David Cosgrove
Hi Ivan,
There is an RDKit extension to SMARTS that allows something like [r12-20].
I can’t check the exact syntax at the moment. You might want to check that
atoms are not in smaller rings as well, so as not to pull up things like
anthracene which might not be something you’d want to class as a macrocycle.
Cheers,
Dave
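
One way to explore the fused-ring concern is to look at which atoms a ring-size query actually matches. The sketch below assumes the range syntax [r{12-}] reported elsewhere in this thread, and uses a hypothetical benzo-fused 12-membered ring as the test case alongside anthracene.

```python
from rdkit import Chem

query = Chem.MolFromSmarts("[r{12-}]")

# Hypothetical example: a benzene ring fused to a 12-membered ring.
# Two aromatic carbons are shared by both rings; the ten chain carbons
# belong only to the large ring.
fused = Chem.MolFromSmiles("c1ccc2c(c1)" + "C" * 10 + "2")
anthracene = Chem.MolFromSmiles("C1=CC2=CC3=CC=CC=C3C=C2C=C1")

# Ring sizes as reported by RDKit's ring perception.
ring_sizes = sorted(len(r) for r in fused.GetRingInfo().AtomRings())

# Indices of atoms whose smallest ring has 12 or more members.
matched_atoms = [match[0] for match in fused.GetSubstructMatches(query)]

print(ring_sizes)
print(len(matched_atoms))
print(anthracene.HasSubstructMatch(query))
```

Because the query is evaluated per atom on the smallest ring containing it, the two fusion atoms are not matched, but the molecule as a whole is still found through the atoms that sit only in the large ring.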

-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk


Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread Omar H94
Dear Thomas,

I'm not aware of any RDKit function that distinguishes macrocycles.
However, based on the definition of rings with 12 or more members, you may
use the following function to get the molecules that have a ring of 12 or
more atoms from a list of molecules:

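def GetMacrocycles(mols):
    # Collect the molecules that contain at least one ring of 12 or more atoms.
    Macrocycles = []
    for mol in mols:
        for RingAtoms in mol.GetRingInfo().AtomRings():
            if len(RingAtoms) >= 12:
                Macrocycles.append(mol)
                break  # one large ring is enough; avoids adding mol twice
    return Macrocycles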
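
A self-contained variant that starts from SMILES strings might look like the sketch below. The names largest_ring_size and macrocycle_smiles, the threshold constant, and the small example library are assumptions made for illustration; skipping unparsable SMILES is also just one possible choice.

```python
from rdkit import Chem

MIN_MACROCYCLE_SIZE = 12  # threshold taken from the definition quoted in this thread


def largest_ring_size(mol):
    """Return the size of the largest ring RDKit reports, or 0 if there is none."""
    return max((len(ring) for ring in mol.GetRingInfo().AtomRings()), default=0)


def macrocycle_smiles(smiles_list, min_size=MIN_MACROCYCLE_SIZE):
    """Keep the SMILES whose molecule has a ring of at least min_size atoms."""
    kept = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:  # unparsable input; skipped here by assumption
            continue
        if largest_ring_size(mol) >= min_size:
            kept.append(smi)
    return kept


# Illustrative library: benzene, a 12-membered ring, ethanol, a 15-membered ring.
ring12 = "C1" + "C" * 11 + "1"
ring15 = "C1" + "C" * 14 + "1"
library = ["c1ccccc1", ring12, "CCO", ring15]
print(macrocycle_smiles(library))
```

Each input is kept at most once regardless of how many large rings it contains, because only the largest ring size is compared against the threshold.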

I hope this helps.

Best regards,
Omar
