Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread Greg Landrum
Ivan's solution, using the SMARTS extension [r{12-}], is what I would use
for this.
I would suggest using the permanent documentation link though:
http://rdkit.org/docs/RDKit_Book.html#smarts-support-and-extensions
That's the one that I keep up to date.

A note to Dave's point about anthracene: because the "r" query is
defined in terms of smallest rings, things like anthracene aren't a problem
when using [r{9-}].
>>> Chem.MolFromSmiles('C1=CC2=CC3=CC=CC=C3C=C2C=C1').HasSubstructMatch(Chem.MolFromSmarts('[r{9-}]'))
False
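
To make the "smallest rings" point concrete, here is a minimal pure-Python sketch that mimics the idea on ring tuples of the kind GetRingInfo().AtomRings() returns. It is not RDKit's implementation; the helper names and the hand-written anthracene ring indices are assumptions for illustration only.

```python
def smallest_ring_sizes(atom_rings):
    """Map each ring atom index to the size of the smallest ring containing it."""
    sizes = {}
    for ring in atom_rings:
        for idx in ring:
            sizes[idx] = min(len(ring), sizes.get(idx, len(ring)))
    return sizes


def any_atom_with_smallest_ring_at_least(atom_rings, min_size):
    """True if some atom's smallest ring has at least min_size atoms."""
    return any(s >= min_size for s in smallest_ring_sizes(atom_rings).values())


# Hand-written stand-in for the three six-membered rings of anthracene.
anthracene_rings = (
    (0, 1, 2, 11, 12, 13),
    (2, 3, 4, 9, 10, 11),
    (4, 5, 6, 7, 8, 9),
)
# A single 12-membered ring (e.g. cyclododecane).
cyclododecane_rings = (tuple(range(12)),)

print(any_atom_with_smallest_ring_at_least(anthracene_rings, 9))     # False
print(any_atom_with_smallest_ring_at_least(cyclododecane_rings, 9))  # True
```

Every anthracene atom sits in a six-membered ring, so no atom has a smallest ring of nine or more atoms, which is consistent with the False result above.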

The next version of the documentation includes a SMARTS reference, so it'll
be easier to look stuff like this up:
https://github.com/rdkit/rdkit/blob/master/Docs/Book/RDKit_Book.rst#smarts-reference



On Wed, Oct 9, 2019 at 6:46 PM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> Hi David,
>
> Thanks for the tip! I just found it in the documentation; the syntax is
> [r{12-20}]. See
> http://rdkit.org/docs_temp/RDKit_Book.html#smarts-support-and-extensions
>
> Note that this doesn't suffer from the hard-coded limitation I mentioned,
> and you can even specify open ranges such as [r{12-}].
>
> Ivan
>
>
> On Wed, Oct 9, 2019 at 12:35 PM David Cosgrove 
> wrote:
>
>> Hi Ivan,
>> There is an RDKit extension to SMARTS that allows something like
>> [r12-20]. I can’t check the exact syntax at the moment. You might want to
>> check that atoms are not in smaller rings as well, so as not to pull up
>> things like anthracene which might not be something you’d want to class as
>> a macrocycle.
>> Cheers,
>> Dave
>>
>> On Wed, 9 Oct 2019 at 14:39, Ivan Tubert-Brohman <
>> ivan.tubert-broh...@schrodinger.com> wrote:
>>
>>> Hi Thomas,
>>>
>>> I don't know of an RDKit function that directly recognizes macrocycles,
>>> but you could find the size of the largest ring this way:
>>>
>>> ri = mol.GetRingInfo()
>>> largest_ring_size = max((len(r) for r in ri.AtomRings()), default=0)
>>> if largest_ring_size >= 12:
>>>     ...
>>>
>>> You can also find if a molecule has a ring of a certain size using
>>> SMARTS, but only for rings up to size 20 at the moment (this is an
>>> RDKit-specific limit). For example, if you are happy with finding rings of
>>> size 12-20, you could use SMARTS [r12,r13,r14,r15,r16,r17,r18,r19,r20].
>>> It's ugly but can be handy if you already have SMARTS-based tools to reuse.
>>>
>>> Ivan
>>>
>>> On Wed, Oct 9, 2019 at 7:25 AM Thomas Evangelidis 
>>> wrote:
>>>
>>>> Greetings,
>>>>
>>>> Is there an automatic way to distinguish the macrocyclic molecules
>>>> within a large chemical library using RDKit? For example, according to this
>>>> definition: Macrocycles are ring structures composed of at least twelve
>>>> atoms in the central cyclic framework [1,2,3]. Maybe someone here has a
>>>> better definition. Could anyone give me some hints on how to program this?
>>>>
>>>> I thank you in advance.
>>>> Thomas
>>>>
>>>> 1. Yudin AK (2015) Macrocycles: lessons from the distant past, recent
>>>> developments, and future directions. Chem Sci 6:30–49.
>>>> 2. Marsault E, Peterson ML (2011) Macrocycles are great cycles:
>>>> applications, opportunities, and challenges of synthetic macrocycles in
>>>> drug discovery. J Med Chem 54:1961–2004.
>>>> 3. Heinis C (2014) Drug discovery: tools and rules for macrocycles. Nat
>>>> Chem Biol 10:696–698.
>>>>
>>>>
>>>> --
>>>>
>>>> ==
>>>>
>>>> Dr. Thomas Evangelidis
>>>>
>>>> Research Scientist
>>>>
>>>> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
>>>> Academy of Sciences, Prague, Czech Republic
>>>>   &
>>>> CEITEC - Central European Institute of Technology, Brno, Czech Republic
>>>>
>>>> email: teva...@gmail.com, Twitter: tevangelidis, LinkedIn: Thomas Evangelidis
>>>>
>>>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>>>
>>>> ___
>>>> Rdkit-discuss mailing list
>>>> Rdkit-discuss@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

>> --
>> David Cosgrove
>> Freelance computational chemistry and chemoinformatics developer
>> http://cozchemix.co.uk
>>


Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Mike Mazanetz
Hello Paolo,

 

Many thanks for the support – I think this would be very useful for anyone 
trying to catch messages generated in this fashion.

 

Really nice work.

 

Thanks,

mike

 

From: Paolo Tosco  
Sent: 09 October 2019 23:34
To: Mike Mazanetz 
Cc: 'RDKit Discuss' 
Subject: Re: [Rdkit-discuss] Inchi which flavour??

 

Hi Mike,

please find here a solution which I have just tested and works well on both 
Unix and Windows.

You need to redirect the C++ stderr stream with ctypes around the call whose 
output you wish to grab.

This can be done defining a context manager that uses ctypes:

import os
import sys
import datetime
import ctypes
import io
import tempfile
from contextlib import contextmanager

# Adapted from
# https://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/
if (sys.platform == "win32"):
    kernel32 = ctypes.WinDLL("kernel32")
    # https://docs.microsoft.com/en-us/windows/console/getstdhandle
    C_STD_ERROR_HANDLE = -12
    c_stderr = kernel32.GetStdHandle(C_STD_ERROR_HANDLE)
    c_flush = kernel32.FlushFileBuffers
else:
    libc = ctypes.CDLL(None)
    c_stderr = ctypes.c_void_p.in_dll(libc, "stderr")
    c_flush = libc.fflush

@contextmanager
def stderr_redirector(stream):
    # The original fd stderr points to.
    original_stderr_fd = sys.stderr.fileno()

    def _redirect_stderr(to_fd):
        """Redirect stderr to the given file descriptor."""
        # Flush the C-level buffer stderr
        c_flush(c_stderr)
        # Flush and close sys.stderr - also closes the file descriptor (fd)
        sys.stderr.close()
        # Make original_stderr_fd point to the same file as to_fd
        os.dup2(to_fd, original_stderr_fd)
        # Create a new sys.stderr that points to the redirected fd
        sys.stderr = io.TextIOWrapper(os.fdopen(original_stderr_fd, 'wb'))

    # Save a copy of the original stderr fd in saved_stderr_fd
    saved_stderr_fd = os.dup(original_stderr_fd)
    try:
        # Create a temporary file and redirect stderr to it
        tfile = tempfile.TemporaryFile(mode='w+b')
        _redirect_stderr(tfile.fileno())
        # Yield to caller, then redirect stderr back to the saved fd
        yield
        _redirect_stderr(saved_stderr_fd)
        # Copy contents of temporary file to the given stream
        tfile.flush()
        tfile.seek(0, io.SEEK_SET)
        stream.write(tfile.read())
    finally:
        tfile.close()
        os.close(saved_stderr_fd)

Then, all you need to grab the RDKit warning printed to stderr is use the 
stderr_redirector() context manager around the relevant call, then check the 
grabbed output for relevant content.

For instance, in your example wrap the Chem.MolToInchi() call as follows:

  f = io.BytesIO()
  with stderr_redirector(f):
      InChi = Chem.MolToInchi(Chem.MolFromSmiles(y))
  grabbed_stderr = f.getvalue().decode('utf-8')
  if ("WARNING" in grabbed_stderr):
      print("caught: ", grabbed_stderr)

Cheers,
p.

On 09/10/2019 18:10, Mike Mazanetz wrote:

Hi,

 

Many thanks for this; it is very helpful to see some code.

Yes, as it stands, I have yet to get the warnings that appear on stdout sent to
a file; only errors seem to find their way into my files.

Warnings about stereochemistry usually don’t get captured. Has anyone else seen
this? I’m guessing it’s the same for failed InChIs too.

 

Thanks,

mike

 

 

From: Scalfani, Vincent    
Sent: 09 October 2019 14:40
To: Maciek Wójcikowski   ; 
Greg Landrum   
Cc: RDKit Discuss   

Subject: Re: [Rdkit-discuss] Inchi which flavour??

 

Hi Maciek and Mike,

 

If I understand your question correctly, you can specify InChI option 
parameters when calculating InChIs. Here is an example:

 

m = Chem.MolFromSmiles('CCC1=CN=C(NC1=O)NC')

Chem.MolToInchi(m)

'InChI=1S/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)'

 

Now, try with one of the non-standard options such as FixedH:

 

Chem.MolToInchi(m,'/FixedH')

'InChI=1/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)/f/h8,10H'

 

To answer the question of what happens when the InChI calculation fails, I get 
an empty string.

 

m = Chem.MolFromSmiles('[C@H]1([C@H](C1C2[C@@H]([C@@H]2C(=O)O)C(=O)O)C(=O)O)C(=O)O')

Chem.MolToInchi(m)

'  '

 

There is also an InChI option that can warn on empty structures, and calculate 
an empty InChI, which I am assuming is supposed to be ‘InChI=1S//’, however, 
when trying this option I get the same result as above.

 

Chem.MolToInchi(m,'/WarnOnEmptyStructure')

'  '

 

I hope that helps. 

 

Vin

 

From: Maciek Wójcikowski <mac...@wojcikowski.pl>
Sent: Wednesday, October 9, 2019 3:41 AM
To: Greg Landrum <greg.land...@gmail.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: Re: 

Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Paolo Tosco

Hi Mike,

please find here a solution which I have just tested and works well on 
both Unix and Windows.


You need to redirect the C++ stderr stream with ctypes around the call 
whose output you wish to grab.


This can be done defining a context manager that uses ctypes:

import os
import sys
import datetime
import ctypes
import io
import tempfile
from contextlib import contextmanager

# Adapted from
# https://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/

if (sys.platform == "win32"):
    kernel32 = ctypes.WinDLL("kernel32")
    # https://docs.microsoft.com/en-us/windows/console/getstdhandle
    C_STD_ERROR_HANDLE = -12
    c_stderr = kernel32.GetStdHandle(C_STD_ERROR_HANDLE)
    c_flush = kernel32.FlushFileBuffers
else:
    libc = ctypes.CDLL(None)
    c_stderr = ctypes.c_void_p.in_dll(libc, "stderr")
    c_flush = libc.fflush

@contextmanager
def stderr_redirector(stream):
    # The original fd stderr points to.
    original_stderr_fd = sys.stderr.fileno()

    def _redirect_stderr(to_fd):
        """Redirect stderr to the given file descriptor."""
        # Flush the C-level buffer stderr
        c_flush(c_stderr)
        # Flush and close sys.stderr - also closes the file descriptor (fd)
        sys.stderr.close()
        # Make original_stderr_fd point to the same file as to_fd
        os.dup2(to_fd, original_stderr_fd)
        # Create a new sys.stderr that points to the redirected fd
        sys.stderr = io.TextIOWrapper(os.fdopen(original_stderr_fd, 'wb'))

    # Save a copy of the original stderr fd in saved_stderr_fd
    saved_stderr_fd = os.dup(original_stderr_fd)
    try:
        # Create a temporary file and redirect stderr to it
        tfile = tempfile.TemporaryFile(mode='w+b')
        _redirect_stderr(tfile.fileno())
        # Yield to caller, then redirect stderr back to the saved fd
        yield
        _redirect_stderr(saved_stderr_fd)
        # Copy contents of temporary file to the given stream
        tfile.flush()
        tfile.seek(0, io.SEEK_SET)
        stream.write(tfile.read())
    finally:
        tfile.close()
        os.close(saved_stderr_fd)

Then, all you need to grab the RDKit warning printed to stderr is use 
the stderr_redirector() context manager around the relevant call, then 
check the grabbed output for relevant content.


For instance, in your example wrap the Chem.MolToInchi() call as follows:

  f = io.BytesIO()
  with stderr_redirector(f):
      InChi = Chem.MolToInchi(Chem.MolFromSmiles(y))
  grabbed_stderr = f.getvalue().decode('utf-8')
  if ("WARNING" in grabbed_stderr):
      print("caught: ", grabbed_stderr)
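
If the grabbed output holds several messages, it may be handy to keep only the lines of interest. The following is a small sketch of that post-processing step; the helper name warning_lines and the sample text are made up for illustration, and the only thing assumed about the real output is that warning lines contain the word "WARNING", as in the check above.

```python
def warning_lines(grabbed_stderr, marker="WARNING"):
    """Return the stripped, non-empty lines of captured text that contain marker."""
    return [line.strip() for line in grabbed_stderr.splitlines()
            if marker in line and line.strip()]


# Made-up sample text standing in for captured stderr output.
sample = "[12:00:00] WARNING: example warning text\nsome other line\n"
print(warning_lines(sample))  # ['[12:00:00] WARNING: example warning text']
```

In the snippet above you would call warning_lines(grabbed_stderr) after the with block has finished.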

Cheers,
p.

On 09/10/2019 18:10, Mike Mazanetz wrote:


Hi,

Many thanks for this; it is very helpful to see some code.

Yes, as it stands, I have yet to get the warnings that appear on stdout
sent to a file; only errors seem to find their way into my files.

Warnings about stereochemistry usually don’t get captured. Has anyone
else seen this? I’m guessing it’s the same for failed InChIs too.


Thanks,

mike

*From:*Scalfani, Vincent 
*Sent:* 09 October 2019 14:40
*To:* Maciek Wójcikowski ; Greg Landrum 


*Cc:* RDKit Discuss 
*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Hi Maciek and Mike,

If I understand your question correctly, you can specify InChI option 
parameters when calculating InChIs. Here is an example:


m = Chem.MolFromSmiles('CCC1=CN=C(NC1=O)NC')

Chem.MolToInchi(m)

'InChI=1S/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)'

Now, try with one of the non-standard options such as FixedH:

Chem.MolToInchi(m,'/FixedH')

'InChI=1/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)/f/h8,10H'

To answer the question of what happens when the InChI calculation 
fails, I get an empty string.


m = Chem.MolFromSmiles('[C@H]1([C@H](C1C2[C@@H]([C@@H]2C(=O)O)C(=O)O)C(=O)O)C(=O)O')


Chem.MolToInchi(m)

'  '

There is also an InChI option that can warn on empty structures, and 
calculate an empty InChI, which I am assuming is supposed to be 
‘InChI=1S//’, however, when trying this option I get the same result 
as above.


Chem.MolToInchi(m,'/WarnOnEmptyStructure')

'  '

I hope that helps.

Vin

*From:*Maciek Wójcikowski >

*Sent:* Wednesday, October 9, 2019 3:41 AM
*To:* Greg Landrum >
*Cc:* RDKit Discuss >

*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Mike,

On top of what Greg said, what might be particularly useful is the
options parameter, where you can pass non-default parameters to the
InChI call.


On Wed, 9 Oct 2019 at 07:22, Greg Landrum wrote:


Hi Mike,

The InChI API itself is not exposed. The contents of the
module are in the documentation along with some explanations of
how to call it:


Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Paolo Tosco

Hi Mike,

as I promised I'll put together something for you to capture warnings; 
I'll try to get it done tonight.


p.


On 10/09/19 18:10, Mike Mazanetz wrote:


Hi,

Many thanks for this; it is very helpful to see some code.

Yes, as it stands, I have yet to get the warnings that appear on stdout
sent to a file; only errors seem to find their way into my files.

Warnings about stereochemistry usually don’t get captured. Has anyone
else seen this? I’m guessing it’s the same for failed InChIs too.


Thanks,

mike

*From:*Scalfani, Vincent 
*Sent:* 09 October 2019 14:40
*To:* Maciek Wójcikowski ; Greg Landrum 


*Cc:* RDKit Discuss 
*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Hi Maciek and Mike,

If I understand your question correctly, you can specify InChI option 
parameters when calculating InChIs. Here is an example:


m = Chem.MolFromSmiles('CCC1=CN=C(NC1=O)NC')

Chem.MolToInchi(m)

'InChI=1S/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)'

Now, try with one of the non-standard options such as FixedH:

Chem.MolToInchi(m,'/FixedH')

'InChI=1/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)/f/h8,10H'

To answer the question of what happens when the InChI calculation 
fails, I get an empty string.


m = Chem.MolFromSmiles('[C@H]1([C@H](C1C2[C@@H]([C@@H]2C(=O)O)C(=O)O)C(=O)O)C(=O)O')


Chem.MolToInchi(m)

'  '

There is also an InChI option that can warn on empty structures, and 
calculate an empty InChI, which I am assuming is supposed to be 
‘InChI=1S//’, however, when trying this option I get the same result 
as above.


Chem.MolToInchi(m,'/WarnOnEmptyStructure')

'  '

I hope that helps.

Vin

*From:*Maciek Wójcikowski >

*Sent:* Wednesday, October 9, 2019 3:41 AM
*To:* Greg Landrum >
*Cc:* RDKit Discuss >

*Subject:* Re: [Rdkit-discuss] Inchi which flavour??

Mike,

On top of what Greg said, what might be particularly useful is the
options parameter, where you can pass non-default parameters to the
InChI call.


On Wed, 9 Oct 2019 at 07:22, Greg Landrum wrote:


Hi Mike,

The InChI API itself is not exposed. The contents of the
module are in the documentation along with some explanations of
how to call it:

http://rdkit.org/docs/source/rdkit.Chem.rdinchi.html

If something is missing there, please let us know.

-greg

On Tue, Oct 8, 2019 at 5:20 PM <mi...@novadatasolutions.co.uk> wrote:

Dear RdKit users,

I was reading the inchi module docs and I couldn't find
methods to call the InChI API.  Are these exposed in RDKit?

It says the default is the standard Inchi.  What happens when
this conversion fails?

Thanks,

Mike



Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Mike Mazanetz
Hi,

 

Many thanks for this; it is very helpful to see some code.

Yes, as it stands, I have yet to get the warnings that appear on stdout sent to
a file; only errors seem to find their way into my files.

Warnings about stereochemistry usually don’t get captured. Has anyone else seen
this? I’m guessing it’s the same for failed InChIs too.

 

Thanks,

mike

 

 

From: Scalfani, Vincent  
Sent: 09 October 2019 14:40
To: Maciek Wójcikowski ; Greg Landrum 

Cc: RDKit Discuss 
Subject: Re: [Rdkit-discuss] Inchi which flavour??

 

Hi Maciek and Mike,

 

If I understand your question correctly, you can specify InChI option 
parameters when calculating InChIs. Here is an example:

 

m = Chem.MolFromSmiles('CCC1=CN=C(NC1=O)NC')

Chem.MolToInchi(m)

'InChI=1S/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)'

 

Now, try with one of the non-standard options such as FixedH:

 

Chem.MolToInchi(m,'/FixedH')

'InChI=1/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)/f/h8,10H'

 

To answer the question of what happens when the InChI calculation fails, I get 
an empty string.

 

m = Chem.MolFromSmiles('[C@H]1([C@H](C1C2[C@@H]([C@@H]2C(=O)O)C(=O)O)C(=O)O)C(=O)O')

Chem.MolToInchi(m)

'  '

 

There is also an InChI option that can warn on empty structures, and calculate 
an empty InChI, which I am assuming is supposed to be ‘InChI=1S//’, however, 
when trying this option I get the same result as above.

 

Chem.MolToInchi(m,'/WarnOnEmptyStructure')

'  '

 

I hope that helps. 

 

Vin

 

From: Maciek Wójcikowski <mac...@wojcikowski.pl>
Sent: Wednesday, October 9, 2019 3:41 AM
To: Greg Landrum <greg.land...@gmail.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] Inchi which flavour??

 

Mike,

 

On top of what Greg said, what might be particularly useful is the options
parameter, where you can pass non-default parameters to the InChI call.

 

On Wed, 9 Oct 2019 at 07:22, Greg Landrum <greg.land...@gmail.com> wrote:

Hi Mike,

 

The InChI API itself is not exposed. The contents of the module are in the 
documentation along with some explanations of how to call it:

http://rdkit.org/docs/source/rdkit.Chem.rdinchi.html 

 

If something is missing there, please let us know.

-greg

 

 

On Tue, Oct 8, 2019 at 5:20 PM <mi...@novadatasolutions.co.uk> wrote:

Dear RdKit users,

I was reading the inchi module docs and I couldn't find methods to call the 
InChI API.  Are these exposed in RDKit?  

It says the default is the standard Inchi.  What happens when this conversion 
fails?

 

Thanks,

Mike



Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread Ivan Tubert-Brohman
Hi David,

Thanks for the tip! I just found it in the documentation; the syntax is
[r{12-20}]. See
http://rdkit.org/docs_temp/RDKit_Book.html#smarts-support-and-extensions

Note that this doesn't suffer from the hard-coded limitation I mentioned,
and you can even specify open ranges such as [r{12-}].
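
For anyone generating these patterns programmatically, here is a small sketch that builds the range form and, for comparison, the enumerated form from my earlier message. The function names are made up for illustration; only the resulting SMARTS strings come from this thread, and the snippet does plain string formatting without calling RDKit.

```python
def ring_size_smarts(min_size, max_size=None):
    """Build a range-query SMARTS such as [r{12-20}] or the open form [r{12-}]."""
    if min_size < 3:
        raise ValueError("rings have at least 3 atoms")
    if max_size is None:
        return "[r{%d-}]" % min_size
    if max_size < min_size:
        raise ValueError("max_size must be >= min_size")
    return "[r{%d-%d}]" % (min_size, max_size)


def enumerated_ring_size_smarts(min_size, max_size):
    """Build the enumerated form, e.g. [r12,r13,...,r20]."""
    return "[" + ",".join("r%d" % n for n in range(min_size, max_size + 1)) + "]"


print(ring_size_smarts(12, 20))             # [r{12-20}]
print(ring_size_smarts(12))                 # [r{12-}]
print(enumerated_ring_size_smarts(12, 20))  # [r12,r13,r14,r15,r16,r17,r18,r19,r20]
```

The resulting string can then be passed to Chem.MolFromSmarts() as in the other examples in this thread.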

Ivan


On Wed, Oct 9, 2019 at 12:35 PM David Cosgrove 
wrote:

> Hi Ivan,
> There is an RDKit extension to SMARTS that allows something like [r12-20].
> I can’t check the exact syntax at the moment. You might want to check that
> atoms are not in smaller rings as well, so as not to pull up things like
> anthracene which might not be something you’d want to class as a macrocycle.
> Cheers,
> Dave
>
> On Wed, 9 Oct 2019 at 14:39, Ivan Tubert-Brohman <
> ivan.tubert-broh...@schrodinger.com> wrote:
>
>> Hi Thomas,
>>
>> I don't know of an RDKit function that directly recognizes macrocycles,
>> but you could find the size of the largest ring this way:
>>
>> ri = mol.GetRingInfo()
>> largest_ring_size = max((len(r) for r in ri.AtomRings()), default=0)
>> if largest_ring_size >= 12:
>>     ...
>>
>> You can also find if a molecule has a ring of a certain size using
>> SMARTS, but only for rings up to size 20 at the moment (this is an
>> RDKit-specific limit). For example, if you are happy with finding rings of
>> size 12-20, you could use SMARTS [r12,r13,r14,r15,r16,r17,r18,r19,r20].
>> It's ugly but can be handy if you already have SMARTS-based tools to reuse.
>>
>> Ivan
>>
>> On Wed, Oct 9, 2019 at 7:25 AM Thomas Evangelidis 
>> wrote:
>>
>>> Greetings,
>>>
>>> Is there an automatic way to distinguish the macrocyclic molecules
>>> within a large chemical library using RDKit? For example, according to this
>>> definition: Macrocycles are ring structures composed of at least twelve
>>> atoms in the central cyclic framework [1,2,3]. Maybe someone here has a
>>> better definition. Could anyone give me some hints on how to program this?
>>>
>>> I thank you in advance.
>>> Thomas
>>>
>>> 1. Yudin AK (2015) Macrocycles: lessons from the distant past, recent
>>> developments, and future directions. Chem Sci 6:30–49.
>>> 2. Marsault E, Peterson ML (2011) Macrocycles are great cycles:
>>> applications, opportunities, and challenges of synthetic macrocycles in
>>> drug discovery. J Med Chem 54:1961–2004.
>>> 3. Heinis C (2014) Drug discovery: tools and rules for macrocycles. Nat
>>> Chem Biol 10:696–698.
>>>
>>>
>>> --
>>>
>>> ==
>>>
>>> Dr. Thomas Evangelidis
>>>
>>> Research Scientist
>>>
>>> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
>>> Academy of Sciences, Prague, Czech Republic
>>>   &
>>> CEITEC - Central European Institute of Technology, Brno, Czech Republic
>>>
>>> email: teva...@gmail.com, Twitter: tevangelidis, LinkedIn: Thomas Evangelidis
>>>
>>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>>
>>>
>>>
> --
> David Cosgrove
> Freelance computational chemistry and chemoinformatics developer
> http://cozchemix.co.uk
>
>


Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread David Cosgrove
Hi Ivan,
There is an RDKit extension to SMARTS that allows something like [r12-20].
I can’t check the exact syntax at the moment. You might want to check that
atoms are not in smaller rings as well, so as not to pull up things like
anthracene which might not be something you’d want to class as a macrocycle.
Cheers,
Dave

On Wed, 9 Oct 2019 at 14:39, Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> Hi Thomas,
>
> I don't know of an RDKit function that directly recognizes macrocycles,
> but you could find the size of the largest ring this way:
>
> ri = mol.GetRingInfo()
> largest_ring_size = max((len(r) for r in ri.AtomRings()), default=0)
> if largest_ring_size >= 12:
>     ...
>
> You can also find if a molecule has a ring of a certain size using SMARTS,
> but only for rings up to size 20 at the moment (this is an RDKit-specific
> limit). For example, if you are happy with finding rings of size 12-20, you
> could use SMARTS [r12,r13,r14,r15,r16,r17,r18,r19,r20]. It's ugly but can
> be handy if you already have SMARTS-based tools to reuse.
>
> Ivan
>
> On Wed, Oct 9, 2019 at 7:25 AM Thomas Evangelidis 
> wrote:
>
>> Greetings,
>>
>> Is there an automatic way to distinguish the macrocyclic molecules within
>> a large chemical library using RDKit? For example, according to this
>> definition: Macrocycles are ring structures composed of at least twelve
>> atoms in the central cyclic framework [1,2,3]. Maybe someone here has a
>> better definition. Could anyone give me some hints on how to program this?
>>
>> I thank you in advance.
>> Thomas
>>
>> 1. Yudin AK (2015) Macrocycles: lessons from the distant past, recent
>> developments, and future directions. Chem Sci 6:30–49.
>> 2. Marsault E, Peterson ML (2011) Macrocycles are great cycles:
>> applications, opportunities, and challenges of synthetic macrocycles in
>> drug discovery. J Med Chem 54:1961–2004.
>> 3. Heinis C (2014) Drug discovery: tools and rules for macrocycles. Nat
>> Chem Biol 10:696–698.
>>
>>
>> --
>>
>> ==
>>
>> Dr. Thomas Evangelidis
>>
>> Research Scientist
>>
>> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
>> Academy of Sciences, Prague, Czech Republic
>>   &
>> CEITEC - Central European Institute of Technology, Brno, Czech Republic
>>
>> email: teva...@gmail.com, Twitter: tevangelidis, LinkedIn: Thomas Evangelidis
>>
>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>
>>
>>
-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk


Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Scalfani, Vincent
Hi Maciek and Mike,

If I understand your question correctly, you can specify InChI option 
parameters when calculating InChIs. Here is an example:

m = Chem.MolFromSmiles('CCC1=CN=C(NC1=O)NC')
Chem.MolToInchi(m)
'InChI=1S/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)'

Now, try with one of the non-standard options such as FixedH:

Chem.MolToInchi(m,'/FixedH')
'InChI=1/C7H11N3O/c1-3-5-4-9-7(8-2)10-6(5)11/h4H,3H2,1-2H3,(H2,8,9,10,11)/f/h8,10H'

To answer the question of what happens when the InChI calculation fails, I get 
an empty string.

m = Chem.MolFromSmiles('[C@H]1([C@H](C1C2[C@@H]([C@@H]2C(=O)O)C(=O)O)C(=O)O)C(=O)O')
Chem.MolToInchi(m)
'  '

There is also an InChI option that warns on empty structures and calculates
an empty InChI, which I assume is supposed to be ‘InChI=1S//’. However, when
I try this option I get the same result as above.

Chem.MolToInchi(m,'/WarnOnEmptyStructure')
'  '
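
Since a failed conversion comes back as a blank string rather than an exception, a thin wrapper can make the failure explicit. This is only a sketch: the name inchi_or_none is made up, and the conversion function is passed in as a parameter (in practice Chem.MolToInchi) so the snippet does not assume anything beyond the two-argument call shown above.

```python
def inchi_or_none(mol, to_inchi, options=""):
    """Call to_inchi(mol, options) and return None when the result is blank."""
    inchi = to_inchi(mol, options)
    if not inchi or not inchi.strip():
        return None
    return inchi


# Stand-in converters used only to exercise the wrapper without RDKit.
def fake_ok(mol, options):
    return "InChI=1S/CH4/h1H4"


def fake_fail(mol, options):
    return "  "


print(inchi_or_none(object(), fake_ok))    # InChI=1S/CH4/h1H4
print(inchi_or_none(object(), fake_fail))  # None
```

With RDKit available, the call would look like inchi_or_none(m, Chem.MolToInchi, '/FixedH').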

I hope that helps.

Vin

From: Maciek Wójcikowski 
Sent: Wednesday, October 9, 2019 3:41 AM
To: Greg Landrum 
Cc: RDKit Discuss 
Subject: Re: [Rdkit-discuss] Inchi which flavour??

Mike,

On top of what Greg said, what might be particularly useful is the options
parameter, where you can pass non-default parameters to the InChI call.

On Wed, 9 Oct 2019 at 07:22, Greg Landrum <greg.land...@gmail.com> wrote:
Hi Mike,

The InChI API itself is not exposed. The contents of the module are in the 
documentation along with some explanations of how to call it:
http://rdkit.org/docs/source/rdkit.Chem.rdinchi.html

If something is missing there, please let us know.
-greg


On Tue, Oct 8, 2019 at 5:20 PM <mi...@novadatasolutions.co.uk> wrote:
Dear RdKit users,
I was reading the inchi module docs and I couldn't find methods to call the 
InChI API.  Are these exposed in RDKit?
It says the default is the standard Inchi.  What happens when this conversion 
fails?

Thanks,
Mike


Re: [Rdkit-discuss] distinguishing macrocyclic molecules

2019-10-09 Thread Omar H94
Dear Thomas,

I'm not aware of any RDKit function that distinguishes macrocycles.
However, based on the definition of 12 or more membered rings, you may use
the following function to get molecules that have rings with 12 or more
atoms from a list of molecules:

def GetMacrocycles(mols):
    Macrocycles = []
    for mol in mols:
        for RingAtoms in mol.GetRingInfo().AtomRings():
            if len(RingAtoms) >= 12:
                Macrocycles.append(mol)
                break  # one large ring is enough; avoid adding the molecule twice
    return Macrocycles
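
The size test itself can be pulled out into a helper that works directly on the ring tuples, which makes it easy to check without a molecule at hand. This is a sketch under the "at least twelve ring atoms" definition; the helper names are made up, and the ring tuples below are hand-written stand-ins for what mol.GetRingInfo().AtomRings() would return.

```python
def largest_ring_size(atom_rings):
    """Size of the largest ring, or 0 for an acyclic molecule."""
    return max((len(ring) for ring in atom_rings), default=0)


def is_macrocycle(atom_rings, min_size=12):
    """Apply the 'at least twelve ring atoms' definition to ring tuples."""
    return largest_ring_size(atom_rings) >= min_size


print(is_macrocycle(()))                   # False (no rings)
print(is_macrocycle((tuple(range(6)),)))   # False
print(is_macrocycle((tuple(range(14)),)))  # True
```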

I hope this helps.

Best regards,
Omar

On Wed, Oct 9, 2019 at 2:25 PM Thomas Evangelidis  wrote:

> Greetings,
>
> Is there an automatic way to distinguish the macrocyclic molecules within
> a large chemical library using RDKit? For example, according to this
> definition: Macrocycles are ring structures composed of at least twelve
> atoms in the central cyclic framework [1,2,3]. Maybe someone here has a
> better definition. Could anyone give me some hints on how to program this?
>
> I thank you in advance.
> Thomas
>
> 1. Yudin AK (2015) Macrocycles: lessons from the distant past, recent
> developments, and future directions. Chem Sci 6:30–49.
> 2. Marsault E, Peterson ML (2011) Macrocycles are great cycles:
> applications, opportunities, and challenges of synthetic macrocycles in
> drug discovery. J Med Chem 54:1961–2004.
> 3. Heinis C (2014) Drug discovery: tools and rules for macrocycles. Nat
> Chem Biol 10:696–698.
>
>
> --
>
> ==
>
> Dr. Thomas Evangelidis
>
> Research Scientist
>
> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
> Academy of Sciences, Prague, Czech Republic
>   &
> CEITEC - Central European Institute of Technology, Brno, Czech Republic
>
> email: teva...@gmail.com, Twitter: tevangelidis, LinkedIn: Thomas Evangelidis
>
> website: https://sites.google.com/site/thomasevangelidishomepage/
>
>
>


Re: [Rdkit-discuss] Inchi which flavour??

2019-10-09 Thread Maciek Wójcikowski
Mike,

On top of what Greg said, what might be particularly useful is the options
parameter, where you can pass non-default parameters to the InChI call.

On Wed, 9 Oct 2019 at 07:22, Greg Landrum wrote:

> Hi Mike,
>
> The InChI API itself is not exposed. The contents of the module are in the
> documentation along with some explanations of how to call it:
> http://rdkit.org/docs/source/rdkit.Chem.rdinchi.html
>
> If something is missing there, please let us know.
> -greg
>
>
> On Tue, Oct 8, 2019 at 5:20 PM  wrote:
>
>> Dear RdKit users,
>> I was reading the inchi module docs and I couldn't find methods to call
>> the InChI API.  Are these exposed in RDKit?
>> It says the default is the standard Inchi.  What happens when this
>> conversion fails?
>>
>> Thanks,
>> Mike