Re: [Rdkit-discuss] How to determine if atoms are part of the same ring?

2017-02-08 Thread Andrew Dalke
On Feb 8, 2017, at 19:22, Markus Metz wrote:
> The question to you is: Is there another more elegant way of doing it? Maybe
> I missed something from the Python API?

I don't quite follow what you are looking for, though I have managed to 
condense your code somewhat, into:

updatedMapping = None
for ring in m.GetRingInfo().AtomRings():
    if set(ring).issubset(maps):
        updatedMapping = ring

if updatedMapping is not None:
    updatedMapping = sorted(updatedMapping)
    for i, atom_idx in enumerate(updatedMapping, 1):
        m.GetAtomWithIdx(atom_idx).SetProp("molAtomMapNumber", str(i))


Is it that you do not want to number the "*" atoms? In that case you can ask 
the query structure for the atoms with atomic number 0:

>>> for atom in corea.GetAtoms():
...     print(atom.GetAtomicNum())
...
0
6
6
6
6
6
6
0

and ignore numbering the atoms at those positions.
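A minimal sketch of that filtering, in plain Python so it runs without RDKit. It relies on GetSubstructMatch() returning molecule atom indices in query-atom order. The atomic numbers are the ones printed above; the match tuple and the helper name non_wildcard_atoms are made up for illustration.

```python
def non_wildcard_atoms(query_atomic_nums, match):
    """Return the matched molecule atom indices whose query atom is not '*'.

    query_atomic_nums[i] is the atomic number of query atom i (0 for '*');
    match[i] is the molecule atom index matched by query atom i.
    """
    return [mol_idx
            for atomic_num, mol_idx in zip(query_atomic_nums, match)
            if atomic_num != 0]


# Atomic numbers as printed for corea above.
query_atomic_nums = [0, 6, 6, 6, 6, 6, 6, 0]
# Hypothetical match tuple, standing in for m.GetSubstructMatch(corea).
match = (12, 3, 4, 5, 6, 7, 8, 9)

to_number = non_wildcard_atoms(query_atomic_nums, match)
print(to_number)  # [3, 4, 5, 6, 7, 8]
```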

Or is it that you don't want to number ring atoms that aren't ring atoms in the
query structure? In that case you can ask the query structure for its rings:

>>> corea.GetRingInfo().AtomRings()
((1, 6, 5, 4, 3, 2),)

and use that to guide which atoms should/should not be numbered.
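One way this could look, again as a plain-Python sketch rather than a definitive recipe. It assumes the match tuple is indexed by query atom index, as GetSubstructMatch() returns it. The ring tuple is the one printed above; the match tuple and the helper name ring_atoms_to_number are hypothetical.

```python
def ring_atoms_to_number(query_rings, match):
    """Map query ring atoms to molecule atom indices, sorted for numbering."""
    # Collect every query atom index that sits in at least one query ring.
    query_ring_atoms = {q_idx for ring in query_rings for q_idx in ring}
    # Translate query atom indices into molecule atom indices via the match.
    return sorted(match[q_idx] for q_idx in query_ring_atoms)


# Ring info as printed for corea above.
query_rings = ((1, 6, 5, 4, 3, 2),)
# Hypothetical match tuple, standing in for m.GetSubstructMatch(corea).
match = (12, 3, 4, 5, 6, 7, 8, 9)

for map_num, mol_idx in enumerate(ring_atoms_to_number(query_rings, match), 1):
    print(map_num, mol_idx)
```

Each (map_num, mol_idx) pair can then be passed to m.GetAtomWithIdx(mol_idx).SetProp("molAtomMapNumber", str(map_num)), as in the condensed code earlier in this message.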

Cheers,

Andrew
da...@dalkescientific.com



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] How to determine if atoms are part of the same ring?

2017-02-08 Thread Markus Metz
Dear RDKit community:

I would like to attach indices to the atoms of a molecule that are defined by
a substructure. The substitution pattern is important, so I needed to include
wildcard atoms in my substructure. When I get the atom indices of the
substructure match, the atoms matched by the wildcards are still included and
would be used for attaching indices (see the end of the attached notebook for
an example).

I found a way to get rid of the non-ring atom; that was straightforward. The
attached atom that is part of another ring needed a different approach: I
determined the atom indices in each ring with GetRingInfo, and then determined
which of these rings is contained in the tuple of atom indices matched by the
pattern. Once that ring is determined I can do what I need.

The question to you is: Is there another more elegant way of doing it? Maybe
I missed something from the Python API?

I would appreciate any input.

Cheers,
Markus


AtomMapping.ipynb
Description: Binary data


Re: [Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral centres(?)

2017-02-08 Thread Greg Landrum
To further help with the work-around: it's safe to sanitize the molecule,
but you cannot call Chem.AssignStereochemistry(), which the SMILES parser
does when you tell it to sanitize.
Here's an example from your gist:

In [2]: m3 = Chem.MolFromSmiles('O[C@H]1CC[C@]11CC[C@@](Cl)(Br)CC1',
   ...:                         sanitize=False)

In [3]: for atom in m3.GetAtoms():
   ...:     print("Stereo:", atom.GetChiralTag(), "Neighbours:",
   ...:           [n.GetSymbol() for n in atom.GetNeighbors()])
   ...:
Stereo: CHI_UNSPECIFIED Neighbours: ['C']
Stereo: CHI_TETRAHEDRAL_CW Neighbours: ['O', 'C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_TETRAHEDRAL_CCW Neighbours: ['C', 'C', 'C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_TETRAHEDRAL_CW Neighbours: ['C', 'Cl', 'Br', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']

In [4]: Chem.SanitizeMol(m3)
Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [5]: for atom in m3.GetAtoms():
   ...:     print("Stereo:", atom.GetChiralTag(), "Neighbours:",
   ...:           [n.GetSymbol() for n in atom.GetNeighbors()])
   ...:
Stereo: CHI_UNSPECIFIED Neighbours: ['C']
Stereo: CHI_TETRAHEDRAL_CW Neighbours: ['O', 'C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_TETRAHEDRAL_CCW Neighbours: ['C', 'C', 'C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_TETRAHEDRAL_CW Neighbours: ['C', 'Cl', 'Br', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']
Stereo: CHI_UNSPECIFIED Neighbours: ['C', 'C']


-greg



Re: [Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral centres(?)

2017-02-08 Thread James Davidson
Hi Greg (et al.),

Thanks for looking into it.
And thanks to Paolo, who gave me a good workaround suggestion – which was to 
desymmetrise the spirocyclic centre by modifying the isotope on one of the 
neighbours.
This is good for attended processing of single molecules, but not so good for 
unattended processing of unknown molecules…

Reading in molecules with sanitize=False is a good start, but my first thought 
was then to do some sort of rSMARTS transform to automate the isomer assignment.
It soon became apparent that this wasn’t the way to go – as abilities are 
limited with an unsanitised molecule(!).

So I ended up with the following:

from rdkit import Chem

m3 = Chem.MolFromSmiles('O[C@H]1CC[C@]11CC[C@@](Cl)(Br)CC1', sanitize=False)
for atom in m3.GetAtoms():
    # chiral centres currently intact
    print("Stereo:", atom.GetChiralTag(), "Neighbours:",
          [n.GetSymbol() for n in atom.GetNeighbors()])

# Find possible spirocentres
for atom in m3.GetAtoms():
    # Compare against the enum value; comparing with the string
    # 'CHI_UNSPECIFIED' is always unequal, so the check would never filter.
    if (len(atom.GetNeighbors()) == 4 and atom.IsInRing() and
            atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED):
        # Candidate spirocentre found: tag an arbitrary (the first) neighbour
        first_neighbour = atom.GetNeighbors()[0]
        first_neighbour.SetIsotope(100)
Chem.SanitizeMol(m3)  # Now we can sanitise
test3_mols = summarise_conformers(m3)  # and generate the conformers (as before)
sdf = Chem.SDWriter('test3.sdf')  # and write them out (but resetting the isotopes first)
for mol in test3_mols:
    for atom in mol.GetAtoms():
        if atom.GetIsotope() == 100:
            atom.SetIsotope(0)
    sdf.write(mol)
sdf.close()


GIST is updated to include this:  
https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442

Kind regards

James


From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 08 February 2017 03:45
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral 
centres(?)

Hi James,

This is definitely a bug. The problem seems to be connected to the way what the 
RDKit calls "ring stereochemistry" is handled when there are spiro linkages.

Here's the github issue: https://github.com/rdkit/rdkit/issues/1294

I'll take a look.

Best,
-greg



On Tue, Feb 7, 2017 at 8:32 PM, James Davidson wrote:
Dear All,

I have hit what I think is a problem with stereochemistry perception/handling 
for certain types of pseudochiral and/or spirocyclic systems.
Basically I am observing that some types of input tetrahedral stereochemical 
information gets lost when an RDKit molecule is generated.
But I only realised this because I was wanting to generate conformers and was 
seeing stereochemical scrambling…

Anyway, an example with pictures will probably explain things better:
https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442

Any help/advice appreciated.

Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended 
for the named addressee(s) only and access to it by anyone else is 
unauthorised. If you are not an addressee, any disclosure or copying of the 
contents of this email or any action taken (or not taken) in reliance on it is 
unauthorised and may be unlawful. If you have received this email in error, 
please notify the sender or 
postmas...@vernalis.com. Email is not a secure 
method of communication and the Company cannot accept responsibility for the 
accuracy or completeness of this message or any attachment(s). Please check 
this email for virus infection for which the Company accepts no responsibility. 
If verification of this email is sought then please request a hard copy. Unless 
otherwise stated, any views or opinions presented are solely those of the 
author and do not represent those of the Company.

The Vernalis Group of Companies
100 Berkshire Place
Wharfedale Road
Winnersh, Berkshire
RG41 5RD, England
Tel: +44 (0)118 938 

To access trading company registration and address details, please go to the 
Vernalis website at www.vernalis.com and click on the 
"Company address and registration details" link at the bottom of the page.
__


