Re: [Rdkit-discuss] isotopic SMILES

2017-02-06 Thread Andrew Dalke
On Feb 7, 2017, at 01:17, Curt Fischer  wrote:
> I am confused by this behavior:
> 
> >>> labeled_etoh = Chem.MolFromSmiles('C[13C]O')
> >>> print(Chem.MolToSmiles(labeled_etoh))
> 
> C[C]O
> 
> >>> print(Chem.MolToSmiles(labeled_etoh, isomericSmiles=True))
> 
> C[13C]O
> 
> 1. Why are there any brackets at all in the first output?  Why not just 'CCO'?

The middle atom in "CCO" has two hydrogens. The middle atom in "C[C]O" has no 
hydrogens.

> 2. Is there any documentation anywhere that the "isomericSmiles" argument is 
> also an "isotopicSmiles" argument?

I don't believe so. A search via DuckDuckGo of rdkit.org finds only two 
irrelevant matches.

> I am also confused about when Chem.MolToSmiles() puts in H atoms in the 
> output.

SMILES has a short-hand notation to represent hydrogens. "[CH4]" and "C" are 
both methane.

When an atom is described using brackets, the number of hydrogens must be 
specified with the H notation.
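
To make the bracket rule concrete, here is a minimal sketch (not RDKit's 
parser) that reads the isotope, element, and hydrogen count from a simple 
bracket atom. The pattern and the parse_bracket_atom name are illustrative 
assumptions; chirality, charge, and atom class are deliberately ignored.

```python
import re

# Simplified pattern for one bracket atom: optional isotope, element symbol,
# optional H count. Chirality, charge and atom class are not handled.
BRACKET_ATOM = re.compile(
    r"\[(?P<isotope>\d+)?(?P<symbol>[A-Z][a-z]?)(?:H(?P<hcount>\d*))?\]"
)


def parse_bracket_atom(text):
    """Return (isotope, symbol, hydrogen_count) for a simple bracket atom."""
    match = BRACKET_ATOM.fullmatch(text)
    if match is None:
        raise ValueError("unsupported bracket atom: %r" % (text,))
    isotope = int(match.group("isotope")) if match.group("isotope") else None
    hcount = match.group("hcount")
    if hcount is None:
        hydrogens = 0            # no H written inside brackets: no hydrogens
    elif hcount == "":
        hydrogens = 1            # a bare "H" means exactly one hydrogen
    else:
        hydrogens = int(hcount)  # "H2", "H3", ... give an explicit count
    return isotope, match.group("symbol"), hydrogens
```

With this sketch, "[13CH]" yields one hydrogen and "[13C]" yields none, which 
is the only difference between the two three_hb inputs quoted below.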

When an atom is described without brackets, the number of hydrogens is 
based on the permitted valence values. C has a valence of 4 and -C- has two 
single bonds, so the middle carbon of CCO has two implicit hydrogens to 
complete the valence.
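
As a rough illustration of that valence rule, the sketch below computes the 
implicit hydrogen count for an unbracketed atom. The valence table follows the 
"normal valences" listed in the OpenSMILES specification; the function name is 
an assumption made for this example, aromatic atoms are not handled, and this 
is not RDKit's implementation.

```python
# Normal valences for atoms that may be written without brackets
# (the SMILES "organic subset"), as listed in the OpenSMILES specification.
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,), "P": (3, 5),
    "S": (2, 4, 6), "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}


def implicit_hydrogens(symbol, bond_order_sum):
    """Hydrogens implied for an unbracketed atom with the given bond order sum."""
    for valence in NORMAL_VALENCES[symbol]:
        # Fill up to the lowest normal valence that can hold the bonds.
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    # More bonds than any normal valence allows: no implicit hydrogens.
    return 0
```

For the middle carbon of CCO the bond order sum is 2, so the function returns 
2; for the oxygen it is 1, giving one hydrogen.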

The output mechanism prefers the short-hand notation when possible. That 
isn't possible if the sum of hydrogens and bond orders does not match one of 
the valence levels, or if there is an isotope, charge, chirality, etc., which 
requires the use of []s.
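
The decision described above can be sketched as a small predicate. This is a 
simplified model under stated assumptions, not RDKit's actual writer: the 
needs_brackets name and its parameters are hypothetical, the valence table is 
the OpenSMILES one, and aromaticity and atom classes are left out.

```python
# Normal valences of the atoms that can be written without brackets
# (OpenSMILES "organic subset").
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,), "P": (3, 5),
    "S": (2, 4, 6), "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}


def needs_brackets(symbol, bond_order_sum, hydrogens,
                   isotope=None, charge=0, chiral=False):
    """Return True if the atom cannot be written in short-hand form."""
    if symbol not in NORMAL_VALENCES:
        return True  # only the organic subset has a short-hand form
    if isotope is not None or charge != 0 or chiral:
        return True  # these properties can only be expressed inside []
    for valence in NORMAL_VALENCES[symbol]:
        if valence >= bond_order_sum:
            # Short-hand works only if a reader would infer the same H count.
            return hydrogens != valence - bond_order_sum
    return hydrogens != 0
```

A carbon with two single bonds and two hydrogens (the middle atom of CCO) does 
not need brackets, while the same carbon with no hydrogens does, which is why 
the non-isomeric output is C[C]O.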

> 
> >>> three_hb1 = Chem.MolFromSmiles('C[13CH](O)C[13C](=O)O')
> >>> three_hb2 = Chem.MolFromSmiles('C[13C](O)C[13C](=O)O')
> >>> print(Chem.MolToSmiles(three_hb1, isomericSmiles=True))
> 
> C[13CH](O)C[13C](=O)O
> 
> >>> print(Chem.MolToSmiles(three_hb2, isomericSmiles=True))
> 
> C[13C](O)C[13C](=O)O
> 
> >>> print(Chem.MolToSmiles(three_hb1, isomericSmiles=False))
> 
> CC(O)CC(=O)O
> 
> >>> print(Chem.MolToSmiles(three_hb2, isomericSmiles=False))
> 
> C[C](O)CC(=O)O
> 
> 3. Why are there no brackets for three_hb1 output, but there are for 
> three_hb2?


I think you mean "for the isomericSmiles=False" output? The first three_hb1 
output has brackets.

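The isotope notation requires []s, so the option of using the short-hand 
notation doesn't exist. In that case the number of hydrogens must be 
specified, as otherwise it means the atom has no hydrogens. With 
isomericSmiles=False the isotope is dropped, so the [13CH] atom of three_hb1 
can be written in short-hand, while the hydrogen-free [13C] atom of three_hb2 
still needs brackets.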


> 4. As far as I can tell, the two three_hb molecules are identical.   Why 
> aren't all Hs removed during canonicalization?

The second atom in three_hb1 has 1 hydrogen and three single bonds.

The second atom in three_hb2 has 0 hydrogens and three single bonds.

They are different structures so have different SMILES.

Cheers,

Andrew
da...@dalkescientific.com



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] chirality assignment

2017-02-06 Thread Suzuki, Rintarou
Dear All,

I'm generating conformations of a molecule:

C1C2C3OC3C1C13OC21C1CC3C2OC21


This molecule has many chiral centers and 10 possible isomers.

The EmbedMolecule command of RDKit_2015_03_1 can generate every isomer, but 
RDKit_2016_09_3 fails for 9 of the 10.

For example,

RDKit_2015_03_1
-
>>> mol=Chem.MolFromSmiles('C1[C@H]2[C@@H]3O[C@@H]3[C@@H]1[C@@]13O[C@@]21[C@@H]1C[C@H]3[C@@H]2O[C@@H]21')
>>> m=Chem.AddHs(mol)
>>> Chem.FindMolChiralCenters( copy(m), includeUnassigned=True )

[(1, 'S'), (2, 'S'), (4, 'R'), (5, 'R'), (6, 'R'), (8, 'R'), (9, 'R'), (11, 
'S'), (12, 'S'), (14, 'R')]

>>> AllChem.EmbedMolecule( m, randomSeed = 256, maxAttempts = 1, clearConfs = False )

0
-


RDKit_2016_09_3
-
>>> mol=Chem.MolFromSmiles('C1[C@H]2[C@@H]3O[C@@H]3[C@@H]1[C@@]13O[C@@]21[C@@H]1C[C@H]3[C@@H]2O[C@@H]21')
>>> m=Chem.AddHs(mol)
>>> Chem.FindMolChiralCenters( copy(m), includeUnassigned=True )

[(1, 'S'), (2, 'S'), (4, 'R'), (5, 'R'), (6, 'R'), (8, 'R'), (9, 'R'), (11, 
'S'), (12, 'S'), (14, 'R')]

>>> AllChem.EmbedMolecule( m, randomSeed = 256, maxAttempts = 100, clearConfs = False )

-1
-


Two chiral centers in this molecule (atoms 6 and 8) are stereo-dependent.
A conformation can be generated when these two atoms are left unassigned, 
but their chiralities then remain unassigned.

RDKit_2016_09_3
-
>>> mol=Chem.MolFromSmiles('C1[C@H]2[C@@H]3O[C@@H]3[C@@H]1C13OC21[C@@H]1C[C@H]3[C@@H]2O[C@@H]21')
>>> m=Chem.AddHs(mol)
>>> Chem.FindMolChiralCenters( copy(m), includeUnassigned=True )

[(1, 'S'), (2, 'S'), (4, 'R'), (5, 'R'), (6, '?'), (8, '?'), (9, 'R'), (11, 
'S'), (12, 'S'), (14, 'R')]

>>> AllChem.EmbedMolecule( m, randomSeed = 256, maxAttempts = 100, clearConfs = False )

0

>>> Chem.AssignAtomChiralTagsFromStructure(m)
>>> Chem.FindMolChiralCenters( copy(m), includeUnassigned=True )

[(1, 'S'), (2, 'S'), (4, 'R'), (5, 'R'), (6, '?'), (8, '?'), (9, 'S'), (11, 
'R'), (12, 'S'), (14, 'R')]

>>> opt = AllChem.UFFOptimizeMolecule( m, maxIters = 1, confId=0)
>>> ff = AllChem.UFFGetMoleculeForceField( m, confId = 0 )
>>> ff.Minimize()

0

>>> Chem.AssignAtomChiralTagsFromStructure(m)
>>> Chem.FindMolChiralCenters( copy(m), includeUnassigned=True )

[(1, 'S'), (2, 'S'), (4, 'R'), (5, 'R'), (6, '?'), (8, '?'), (9, 'R'), (11, 
'S'), (12, 'S'), (14, 'R')]
-

How can I assign the chiralities of these atoms in RDKit_2016_09_3?


Regards,
Rintarou


Suzuki, Rintarou
National Agriculture and Food Research Organization
Tsukuba, Japan



[Rdkit-discuss] isotopic SMILES

2017-02-06 Thread Curt Fischer
Hello rdkit users,

What behavior should we expect for Chem.MolToSmiles() when dealing with
isotopically substituted molecules?


I am confused by this behavior:

>>> labeled_etoh = Chem.MolFromSmiles('C[13C]O')
>>> print(Chem.MolToSmiles(labeled_etoh))

C[C]O


>>> print(Chem.MolToSmiles(labeled_etoh, isomericSmiles=True))

C[13C]O


1. Why are there any brackets at all in the first output?  Why not just
'CCO'?
2. Is there any documentation anywhere that the "isomericSmiles" argument
is also an "isotopicSmiles" argument?

I am also confused about when Chem.MolToSmiles() puts in H atoms in the
output.

>>> three_hb1 = Chem.MolFromSmiles('C[13CH](O)C[13C](=O)O')
>>> three_hb2 = Chem.MolFromSmiles('C[13C](O)C[13C](=O)O')
>>> print(Chem.MolToSmiles(three_hb1, isomericSmiles=True))


C[13CH](O)C[13C](=O)O


>>> print(Chem.MolToSmiles(three_hb2, isomericSmiles=True))

C[13C](O)C[13C](=O)O


>>> print(Chem.MolToSmiles(three_hb1, isomericSmiles=False))

CC(O)CC(=O)O


>>> print(Chem.MolToSmiles(three_hb2, isomericSmiles=False))


C[C](O)CC(=O)O


3. Why are there no brackets for three_hb1 output, but there are for
three_hb2?
4. As far as I can tell, the two three_hb molecules are identical.   Why
aren't all Hs removed during canonicalization?

Curt


[Rdkit-discuss] Looking for a bit of testing for py27 on windows

2017-02-06 Thread Greg Landrum
Dear all,
I'd like to try an experiment with the Windows build of the new RDKit patch 
release (2016.09.4): instead of using the (ancient) recommended compiler for 
the conda build, I have done a build using the most recent version of Visual 
Studio (VS2015). It would make life significantly easier if we could use this 
as the standard solution for doing Windows builds. I've tested on both Windows 
10 and Windows 7, but before I put it on the normal conda site, I'd like to be 
sure that it works for others too.
If you're a conda user and are willing to help out, please try creating a new 
conda environment with python 2.7 and installing the rdkit like this: "conda 
install -c greglandrum rdkit".  If conda has problems finding boost, you may 
need to: "conda install -c rdkit boost" first. 
If you do give it a try, please let me know whether it works or not.
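
For anyone reporting back, a small script along these lines may help collect 
the relevant details. It is only a sketch: the describe_install helper is 
hypothetical, it uses nothing beyond the standard library so it also runs on 
Python 2.7, and the __version__ lookup falls back to a placeholder if the 
attribute is absent.

```python
import platform
import sys


def describe_install(module_name="rdkit"):
    """Try to import a module and summarise the result for a bug report."""
    report = {
        "python": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "platform": platform.platform(),
        "module": module_name,
    }
    try:
        module = __import__(module_name)
    except ImportError as exc:
        # A missing package or a failed DLL load both surface as ImportError.
        report["imported"] = False
        report["detail"] = str(exc)
    else:
        report["imported"] = True
        report["detail"] = getattr(module, "__version__", "version unknown")
    return report
```

Running print(describe_install()) inside the new conda environment shows 
whether the import succeeded, along with the Python and Windows versions.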
If you do encounter problems, one thing to try is installing the VS2015 
redistributable DLLs from Microsoft: 
https://www.microsoft.com/en-us/download/details.aspx?id=48145 (these are 
normal DLLs, nothing odd).
Thanks, in advance, for any feedback!
-greg

